Welcome to the first post in our Seedlist Series! This series explains what a seedlist is and how to create your own.
Most social media research studies conversations. Few systematically study communities. Enter the seedlist: an entity-centered method that allows researchers to follow how communities behave online across multiple platforms and over long periods of time. In this post, we will break down how a seedlist works and why it matters.
For over 20 years, social scientists have used social media data to tackle a wide range of questions, from the impact of social media on climate discourse (Schafer & Painter, 2020) to how it has increased the number of actors involved in the election process (Garnett & James, 2020). Much of this research relies on URLs, hashtags, and keywords to gather and analyze data. Yarchi and colleagues (2020), for instance, investigated online political polarization around a political controversy in the West Bank. They analyzed 16 months of Facebook, X/Twitter, and WhatsApp data, collected through keyword searches on Facebook and X/Twitter and through public groups on WhatsApp. Their analysis improved our understanding of political polarization on social media, but also highlighted the need for more studies examining polarization across a variety of contexts and platforms. Other studies have used URLs to analyze the social media ecosystem, such as Di Martino and colleagues (2024), who investigated political polarization and fragmentation across that ecosystem. Together, these works have advanced our understanding of the modern information ecosystem, providing insights into internet fragmentation and societal polarization.
While these methods help us understand popular topics and emerging conversations, they often overlook nuanced aspects of online networks. They study conversations, traced through keywords and hashtags, rather than communities, which can only be studied by following entities such as people and groups. Entity-centered social media research that spans multiple years remains rare, leaving an important gap in the literature.
Our proposed method of data collection fills this gap by first identifying the relevant players and communities in the form of a seedlist and then recording their social media activities through historical and ongoing data collection. This results in a multi-platform, longitudinal database of social media posts that enables researchers to track how communities evolve over time and compare behavior between platforms. Furthermore, the entities can act as nodes linking data from different platforms, facilitating the study of networks across platforms. In our case, building a comprehensive seedlist that contains all the major players in the Canadian social media ecosystem provided us with a holistic view of the information landscape that affects Canadians. We therefore present the seedlist as a crucial tool for social media research.
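To make the idea of entities acting as nodes that link platforms more concrete, here is a minimal Python sketch of how a seedlist could be represented. It is an illustration under our own assumptions, not the schema used in our project: the `Entity` class, its fields, the helper functions, and all names and handles are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """One seedlist entry: a person or organization and its known handles."""
    entity_id: str
    name: str
    category: str
    # Maps a platform name to the entity's handle on that platform.
    handles: Dict[str, str] = field(default_factory=dict)


def build_handle_index(seedlist: List[Entity]) -> Dict[tuple, str]:
    """Map (platform, lowercased handle) to entity_id for fast lookup."""
    index = {}
    for entity in seedlist:
        for platform, handle in entity.handles.items():
            index[(platform, handle.lower())] = entity.entity_id
    return index


def group_posts_by_entity(posts: List[dict], index: Dict[tuple, str]) -> Dict[str, list]:
    """Attach each post to its seedlist entity; posts from unknown handles are skipped."""
    grouped = defaultdict(list)
    for post in posts:
        entity_id = index.get((post["platform"], post["handle"].lower()))
        if entity_id is not None:
            grouped[entity_id].append(post)
    return dict(grouped)


# Hypothetical entries and posts, for illustration only.
seedlist = [
    Entity("e001", "Example Daily News", "news_outlet",
           {"facebook": "ExampleDaily", "x": "example_daily"}),
    Entity("e002", "Jane Example", "politician",
           {"x": "jane_example", "youtube": "JaneExampleTV"}),
]
posts = [
    {"platform": "facebook", "handle": "exampledaily", "date": "2023-07-01"},
    {"platform": "x", "handle": "example_daily", "date": "2023-07-02"},
    {"platform": "x", "handle": "someone_else", "date": "2023-07-02"},
]

index = build_handle_index(seedlist)
by_entity = group_posts_by_entity(posts, index)
print({entity_id: len(items) for entity_id, items in by_entity.items()})
```

Because every handle resolves to a single `entity_id`, posts collected from different platforms end up under the same entity, which is what makes cross-platform comparison possible. Posts from accounts outside the seedlist are simply ignored.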
The seedlist is useful for social media analysis that:
- Analyzes a specific set of entities over time
- Analyzes entity activities and relationships across multiple platforms using normalized metrics (to enable cross-platform comparison)
A good example of a project that meets these conditions is our analysis of the impact of Meta’s news ban on Canadian news organizations. We built a seedlist that included 773 news outlets in Canada (and all their social media handles) and studied their overall behavior before and after Meta’s news ban went into effect. Our seedlist also allowed us to study the difference in news outlet behavior between Meta platforms and non-Meta platforms.
On the other hand, because the seedlist is entity-centric, it is not necessary for analyses that focus on digital content rather than the entities that produce it. It may also be disproportionate to the scope of analyses that examine only a single point in time.
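A before-and-after comparison of this kind can be sketched in a few lines of Python. The example below is a simplified illustration, not the procedure used in our study: it normalizes each platform against its own baseline by dividing the posting rate after a cutoff date by the posting rate before it, so platforms with very different volumes can be compared on the same scale. The function name, the dates, and the posts are all made up for the example.

```python
from collections import Counter
from datetime import date


def posting_rate_change(posts, start, cutoff, end):
    """Return {platform: rate_after / rate_before}, with rates in posts per day.

    The "before" window is [start, cutoff) and the "after" window is [cutoff, end).
    Platforms with no posts in the "before" window are left out, since the
    ratio is undefined for them.
    """
    days_before = (cutoff - start).days
    days_after = (end - cutoff).days
    before, after = Counter(), Counter()
    for post in posts:
        if start <= post["date"] < cutoff:
            before[post["platform"]] += 1
        elif cutoff <= post["date"] < end:
            after[post["platform"]] += 1

    change = {}
    for platform, n_before in before.items():
        rate_before = n_before / days_before
        rate_after = after[platform] / days_after
        change[platform] = rate_after / rate_before
    return change


# Made-up posts from seedlist entities; the cutoff is a hypothetical date.
posts = (
    [{"platform": "facebook", "date": date(2023, 7, d)} for d in (2, 4, 6, 8)]
    + [{"platform": "facebook", "date": date(2023, 7, 15)}]
    + [{"platform": "x", "date": date(2023, 7, d)} for d in (3, 9, 12, 18)]
)

change = posting_rate_change(
    posts,
    start=date(2023, 7, 1),
    cutoff=date(2023, 7, 11),
    end=date(2023, 7, 21),
)
print(change)
```

In this toy data, the Facebook posting rate falls to a quarter of its earlier level while the X rate stays the same. A ratio below 1 means activity dropped after the cutoff, and a ratio above 1 means it grew, regardless of how busy the platform was to begin with.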
The seedlist is not necessary for social media analysis that:
- Only analyzes content that is entity-agnostic, such as hashtag, keyword, or topic analysis
- Carries out cross-sectional analysis without a temporal dimension
As social media ecosystems continue to evolve, researchers need tools that allow them to examine communities across platforms and over time. The seedlist offers a practical, scalable solution for building multi-platform datasets. In upcoming posts, we will walk through exactly how to build and maintain a seedlist.
Bibliography
Di Martino, Edoardo, Alessandro Galeazzi, Michele Starnini, Walter Quattrociocchi, and Matteo Cinelli. 2024. “Characterizing the Fragmentation of the Social Media Ecosystem.” https://arxiv.org/abs/2411.16826
Garnett, Holly Ann, and Toby S. James. 2020. “Cyber Elections in the Digital Age: Threats and Opportunities of Technology for Electoral Integrity.” Election Law Journal: Rules, Politics, and Policy 19 (2): 111-261. https://doi.org/10.1089/elj.2020.0633
Schafer, Mike S., and James Painter. 2020. “Climate Journalism in a Changing Media Ecosystem: Assessing the Production of Climate Change-Related News Around the World.” WIREs Climate Change 12 (1): 1-20. https://doi.org/10.1002/wcc.675
Yarchi, Moran, Christian Baden, and Neta Kligler-Vilenchik. 2020. “Political Polarization on the Digital Sphere: A Cross-Platform, Over-Time Analysis of Interactional, Positional, and Affective Polarization on Social Media.” Political Communication 38 (1–2): 98–139. https://doi.org/10.1080/10584609.2020.1785067
