Brainstorming is a crucial process for stimulating the generation of creative ideas, and it remains widely used today. Group brainstorming offers the advantage of obtaining diverse opinions from others, perspectives that may not arise in individual sessions. However, group brainstorming is susceptible to decreased overall productivity due to three factors. The first is "free riding" or "social loafing," where certain members rely excessively on others, reducing individual contributions. The second is "social inhibition," where the presence of others suppresses individual performance. The third is "production blocking," where individuals are prevented from expressing their ideas while other group members are presenting theirs. In this study, we address the first two factors, free riding/social loafing and social inhibition, by implementing a brainstorming support system that assigns the roles of other participants in group brainstorming to agents. By interacting with individuals through different functionalities, the agents mimic human group brainstorming, enabling individuals to enjoy its benefits while mitigating the decrease in individual performance. We designed the agents based on the IBIS structure (Issue, Idea, Pros, Cons) and implemented them with GPT-3.5-turbo. The four types of agents are: (1) agents that freely generate ideas from the theme; (2) agents that generate ideas from other ideas; (3) agents that generate issues from ideas; and (4) agents that generate ideas from issues. Agents (2)-(4) reply to ideas and issues while prioritizing human posts. To validate the effectiveness of the agents, we conducted a comparative experiment using the bulletin-board-style discussion platform D-Agree, comparing scenarios where brainstorming was conducted by humans alone (A), by humans collaborating with agents (B), and by agents alone (C).
In scenario (A), two groups of three individuals each conducted separate brainstorming sessions on different themes. In scenario (B), individuals conducted brainstorming sessions with agents on themes they had not addressed in scenario (A). The results of the evaluation experiment show a tendency for the number of comments and ideas per individual to increase in scenario (B), where humans collaborated with agents, compared to scenario (A), where only humans participated. Moreover, the numbers of ideas and topics per brainstorming session were highest in scenario (B). However, these increases varied considerably among individuals. Furthermore, questionnaire results indicate less hesitation to contribute ideas and a greater perceived ability to generate many ideas in scenario (B) than in scenario (A). The large individual variation in the increases in comments per individual, ideas per individual, ideas per session, and topics per session suggests that the system needs improvements to ensure a consistent increase in the number of ideas regardless of the user. Additional experiments with larger sample sizes are also needed to confirm the statistical significance of the results obtained in this study.
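The four agent roles described above can be viewed as prompt templates over the IBIS node types. The following minimal sketch illustrates this framing; the prompt wording, the `AGENT_PROMPTS` dictionary, and the `build_prompt` helper are our own illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of the four IBIS-based agent roles as prompt templates.
# Template wording and structure are illustrative assumptions, not the paper's code.
AGENT_PROMPTS = {
    "free_ideation":   "Theme: {theme}\nPropose a new idea related to this theme.",
    "idea_from_idea":  "Existing idea: {post}\nPropose a new idea that builds on it.",
    "issue_from_idea": "Existing idea: {post}\nRaise an issue (a question or problem) about it.",
    "idea_from_issue": "Open issue: {post}\nPropose an idea that addresses this issue.",
}

def build_prompt(agent_type: str, theme: str, post: str = "") -> str:
    """Fill the template for one agent type. Agents (2)-(4) reply to an
    existing post; a real system would prioritize human posts when
    selecting which `post` to respond to."""
    return AGENT_PROMPTS[agent_type].format(theme=theme, post=post)

# The resulting string would be sent to an LLM such as GPT-3.5-turbo.
print(build_prompt("idea_from_issue", theme="urban mobility",
                   post="Bike lanes are unsafe at night"))
```

In such a design, the choice of which template fires, and on which post, is what differentiates the four agents, while the underlying language model stays the same.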
With the proliferation of AI, there is growing concern that individuals may become overly reliant on AI, leading to a decrease in intrinsic skills and autonomy. On the other hand, assistive AI frameworks have the potential to improve human learning and performance by providing personalized learning experiences and real-time feedback. To study these opposing viewpoints on the consequences of AI assistance, we conducted a behavioral experiment using a dynamic decision-making game to assess how AI assistance impacts user performance, skill transfer, and cognitive engagement in task execution. Participants were assigned to one of four conditions that featured AI assistance at different time points during the task. Our results suggest that AI assistance can improve immediate task performance without inducing human skill degradation or carryover effects in human learning. This observation has important implications for AI assistive frameworks, as it suggests that there are classes of tasks in which assistance can be provided without risking the autonomy of the user. We discuss possible reasons for this set of effects and explore their implications for future research directions.
Previous efforts to support creative problem-solving have included (a) techniques such as brainstorming and design thinking to stimulate creative ideas, and (b) software tools to record and share these ideas. Now, generative AI technologies can suggest new ideas that might never have occurred to the users, and users can then select from these ideas or use them to stimulate even more ideas. To explore these possibilities, we developed a system called Supermind Ideator that uses a large language model (LLM) and adds prompts, fine-tuning, and a specialized user interface to help users reformulate their problem statements and generate possible solutions. This provides scaffolding to guide users through a set of creative problem-solving techniques, including some techniques specifically intended to help generate innovative ideas about designing groups of people and/or computers ("superminds"). In an experimental study, we found that people using Supermind Ideator generated significantly more innovative ideas than people using ChatGPT or people working alone. Thus, our results suggest that the benefits of using LLMs for creative problem-solving can be substantially enhanced by scaffolding designed specifically for this purpose.
Any forecasting model can be represented by a virtual trader in a prediction market, endowed with a budget, risk preferences, and beliefs inherited from the model. We propose and implement a profitability test for the evaluation of forecasting models based on this idea. The virtual trader enters a position and adjusts its portfolio over time in response to changes in the model forecast and market prices, and its profitability can be used as a measure of model accuracy. We implement this test using probabilistic forecasts for competitive states in the 2020 US presidential election and congressional elections in 2020 and 2022, using data from three sources: model-based forecasts published by The Economist and FiveThirtyEight, and prices from the PredictIt exchange. The proposed approach can be applied more generally to any forecasting activity as long as models and markets referencing the same events exist.
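The virtual-trader idea above can be illustrated with a toy backtest. The following sketch is a deliberate simplification of the profitability test (the all-in position rule and the `profitability` function are our assumptions; the paper's trader has a budget, risk preferences, and portfolio-adjustment rules not modeled here):

```python
def profitability(model_probs, market_prices, outcome):
    """Toy virtual-trader test (our simplification, not the paper's exact
    rules): each period the trader goes all-in on 'yes' shares when the
    model probability exceeds the market price, all-in on 'no' shares
    when it is lower, and stays in cash otherwise. Final wealth relative
    to the starting 1.0 measures how profitable the model's forecasts
    were against the market."""
    wealth = 1.0
    # Append the settlement price: 1 if the event occurred, else 0.
    path = list(market_prices) + [1.0 if outcome else 0.0]
    for p_model, (price, nxt) in zip(model_probs, zip(path, path[1:])):
        if p_model > price and price > 0:        # hold 'yes' shares
            wealth *= nxt / price
        elif p_model < price and price < 1:      # hold 'no' shares
            wealth *= (1 - nxt) / (1 - price)
    return wealth

# A model that priced the event above the market, on an event that occurred:
print(profitability([0.7, 0.7], [0.5, 0.6], outcome=True))  # → 2.0
```

Here the trader doubles its wealth because the market price moved toward, and settled at, the model's higher probability; a miscalibrated model would instead see its wealth shrink.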
In crowdsourcing, quality control is commonly achieved by having workers examine items and vote on their correctness. To minimize the impact of unreliable worker responses, a δ-margin voting process is used, in which additional votes are solicited until a predetermined agreement threshold δ between workers is exceeded. The process is widely adopted, but only as a heuristic. Our research presents a modeling approach using absorbing Markov chains to analyze the properties of this voting process that matter in crowdsourced workflows. We provide closed-form equations for the quality of the resulting consensus vote, the expected number of votes required for consensus, the variance of vote requirements, and other distribution moments. Our findings demonstrate how the threshold δ can be adjusted to achieve quality equivalence across voting processes that employ workers with varying accuracy levels. We also provide efficiency-equalizing payment rates for voting processes with different expected response accuracy levels. Additionally, our model accommodates items with varying degrees of difficulty and uncertainty about the difficulty of each item. Our simulations, using real-world crowdsourced vote data, validate the effectiveness of our theoretical model in characterizing the consensus aggregation process. The results of our study can be effectively employed in practical crowdsourcing applications.
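The δ-margin process maps onto a classical gambler's-ruin walk: the vote tally moves toward the correct label with each worker's accuracy p and is absorbed once the margin reaches ±δ. A sketch under the simplest assumption of that model, identical, independent workers with a single accuracy p (the function names are ours), checks the standard closed form against simulation:

```python
import random

def delta_margin_accuracy(p, delta):
    """Probability that δ-margin voting reaches the correct consensus when
    each vote is independently correct with probability p. The tally is a
    random walk absorbed at +δ (correct) or -δ (wrong); the gambler's-ruin
    formula gives 1 / (1 + ((1-p)/p)**δ)."""
    r = (1 - p) / p
    return 1 / (1 + r ** delta)

def simulate(p, delta, trials=20000, seed=0):
    """Monte-Carlo check: solicit votes until the margin between correct
    and wrong votes reaches δ, and report the fraction of trials that end
    on the correct side."""
    rng = random.Random(seed)
    correct_runs = 0
    for _ in range(trials):
        margin = 0
        while abs(margin) < delta:
            margin += 1 if rng.random() < p else -1
        correct_runs += margin == delta
    return correct_runs / trials

print(round(delta_margin_accuracy(0.7, 3), 3))  # → 0.927
print(simulate(0.7, 3))                          # close to the closed form
```

The same formula shows the quality-equivalence idea from the abstract: raising δ compensates for lower worker accuracy p, so two pools with different p can be tuned to the same consensus quality at different cost.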
Keywords: crowdsourcing, labeling aggregation, majority voting, data quality control, fair remuneration, Markov random walk.
The threat of rapidly spreading health misinformation through social media during crises like COVID-19 emphasizes the importance of addressing both clear falsehoods and complex misinformation, including conspiracy theories and subtle distortions. This paper presents a novel tripartite collective intelligence approach that integrates deep neural networks (DNNs), large language models (LLMs), and crowdsourced human intelligence (HI) to collaboratively detect complex forms of public health misinformation on social media. Our design is inspired by the complementary strengths of DNNs, LLMs, and HI. We observe that DNNs efficiently handle large datasets for initial misinformation screening but struggle with complex content and rely on high-quality training data. LLMs enhance misinformation detection with improved language understanding but may sometimes provide eloquent yet factually incorrect explanations, risking the mislabeling of misinformation. HI provides critical thinking and ethical judgment superior to DNNs and LLMs but is slower and more costly in misinformation detection. In particular, we develop TriIntel, a tripartite collaborative intelligence framework that leverages the collective intelligence of DNNs, LLMs, and HI to tackle the public health misinformation detection problem under a novel few-shot and uncertainty-aware maximum likelihood estimation framework. Evaluation results on a real-world public health misinformation detection application related to COVID-19 show that TriIntel outperforms representative DNN, LLM, and human-AI collaboration baselines in accurately detecting public health misinformation under a diverse set of evaluation scenarios.
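The division of labor among DNNs, LLMs, and HI described above can be illustrated with a generic confidence-threshold escalation scheme. This sketch is our own illustrative assumption (the `route_post` function and its thresholds are hypothetical) and is not TriIntel's actual uncertainty-aware maximum likelihood estimator:

```python
def route_post(dnn_confidence, llm_confidence, tau_dnn=0.9, tau_llm=0.8):
    """Illustrative escalation logic, not TriIntel's actual estimator:
    a DNN cheaply screens every post; posts it is unsure about go to an
    LLM; posts the LLM is also unsure about are escalated to crowdsourced
    human workers. tau_dnn and tau_llm are hypothetical thresholds."""
    if dnn_confidence >= tau_dnn:
        return "dnn"    # accept the fast, cheap DNN label
    if llm_confidence >= tau_llm:
        return "llm"    # accept the LLM label (with its explanation)
    return "human"      # slower, costlier human judgment as last resort

print(route_post(0.95, 0.50))  # → dnn
print(route_post(0.60, 0.85))  # → llm
print(route_post(0.60, 0.50))  # → human
```

Such routing reflects the cost ordering in the abstract, reserving expensive human attention for the cases where both automated tiers are uncertain, though TriIntel combines the three signals jointly rather than in a strict cascade.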