This final post in our Seedlist Series describes the maintenance work a seedlist needs. We recommend reviewing the seedlist on a regular schedule and keeping a channel open for ad-hoc updates requested by end users. If your collection includes elected officials, you should also schedule a review and update of the seedlist after each major election.
Identifying updates and corrections
Entities are not static. An entity’s characteristics may change over time, and the seedlist must reflect these changes to avoid becoming outdated. For example, the results of an election may change an entity’s type from “politician” to “influencer” or vice versa. Among demographic characteristics, time-dependent traits such as role or location are the most likely to require updates. Social media handles can also require updates due to account suspensions, deactivations, or an entity migrating to a new account. Inaccuracies can also originate from the initial data entry itself. All of these scenarios require an entity’s information to be updated or corrected in the seedlist.
A seedlist update process is usually triggered in two ways: user-reported issues and planned updates.
User-reported issues are small-scale updates that arise when users notify us of outdated or incorrect information. For this reason, we recommend providing a clear method of contact so that users can flag discrepancies in the seedlist. It is also useful to assign priority levels to flagged issues so that urgent corrections are handled first, while lower-priority corrections can be handled as a batch.
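As a rough illustration of the triage step described above, here is a minimal Python sketch. The priority labels, the `FlaggedIssue` fields, and the `triage` function are all hypothetical names chosen for this example and do not describe any particular tracking system.

```python
from dataclasses import dataclass

# Hypothetical priority labels; lower number means handled sooner.
PRIORITY_ORDER = {"urgent": 0, "normal": 1, "low": 2}


@dataclass
class FlaggedIssue:
    entity_id: str
    description: str
    priority: str  # one of the keys in PRIORITY_ORDER


def triage(issues):
    """Split flagged issues into urgent ones and a batch for later.

    Urgent issues are returned in the order they were reported.
    The batch is sorted so that "normal" issues precede "low" ones.
    """
    urgent = [i for i in issues if i.priority == "urgent"]
    batch = sorted(
        (i for i in issues if i.priority != "urgent"),
        key=lambda i: PRIORITY_ORDER[i.priority],
    )
    return urgent, batch


issues = [
    FlaggedIssue("e001", "location out of date", "low"),
    FlaggedIssue("e002", "handle points to the wrong account", "urgent"),
    FlaggedIssue("e003", "role changed", "normal"),
]
urgent, batch = triage(issues)
print([i.entity_id for i in urgent])
print([i.entity_id for i in batch])
```

In this sketch the urgent list would be worked through immediately, while the batch could wait for the next scheduled review.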
Planned updates may be triggered by external events or by recurring maintenance tasks. Certain external events, such as elections, can systematically alter entity attributes. In this case, you should allocate appropriate time and resources to review all affected seeds and record any changes that occurred. Historically, it has taken us around one week to review all changes after a provincial election and around three weeks after a federal election.[1]
Planned updates also include recurring maintenance tasks. Because social media accounts come and go, account deactivations are one of the most common changes we observe. This happens when an entity deletes their social media account or simply stops posting. Deactivating these accounts in the seedlist keeps it up to date and avoids spending crawler time on them. We suggest defining an inactivity threshold after which an account is considered deactivated. Once the threshold is established, you can either automatically flag accounts for deactivation on an ad-hoc basis or review all accounts that meet this threshold at a regular interval. In our case, we define a deactivated account as one with 12 consecutive months of inactivity, and we conduct an annual review of all accounts that meet this threshold.
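The threshold check described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the account records, the `last_post` field, and the function names are hypothetical, and an account with no recorded posts is treated as inactive.

```python
from datetime import date

# Assumed threshold: 12 consecutive months without a post.
INACTIVITY_MONTHS = 12


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def flag_inactive(accounts, today, threshold_months=INACTIVITY_MONTHS):
    """Return the handles that meet the inactivity threshold.

    An account is flagged when its last post is at least
    `threshold_months` old, or when no post is recorded at all.
    """
    flagged = []
    for account in accounts:
        last_post = account["last_post"]
        if last_post is None or months_between(last_post, today) >= threshold_months:
            flagged.append(account["handle"])
    return flagged


# Made-up example records.
accounts = [
    {"handle": "@active_example", "last_post": date(2024, 11, 3)},
    {"handle": "@quiet_example", "last_post": date(2023, 6, 15)},
    {"handle": "@deleted_example", "last_post": None},
]
print(flag_inactive(accounts, today=date(2025, 1, 10)))
```

The flagged handles are candidates for review rather than automatic removal; a person can still confirm each one before the seedlist is changed.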
Recording changes
Once a change is identified, there are several ways to record it, depending on the level of traceability required. The simplest option is to overwrite the seedlist directly. Your seedlist will be up to date, but you will not be able to track what changed and when. The next option is to overwrite the seedlist while also keeping timestamped copies of previous versions. This allows you to track changes, although some work may be needed to reconstruct an entity’s history. The most comprehensive option is to build the seedlist as a temporal dataset. In this model, each state of an entity is treated as its own record with a defined validity period. When any attribute changes, the existing record expires and a new one is created.
Within our team, we implement the last option using a structure we call the “Platform Handle History” (PHH). The format of this table was developed internally to log changes at the entity level and individual handle level. We plan to cover the design of the PHH in more detail in a separate technical guide.
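To make the temporal-dataset idea concrete, here is a minimal Python sketch of the expire-and-create pattern described above. It is not the PHH format, whose design is not covered in this post. The `EntityState` fields, the function names, and the dates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EntityState:
    entity_id: str
    role: str
    handle: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the record is current


def apply_change(history, entity_id, change_date, **changes):
    """Expire the current record for `entity_id` and append a new one.

    Returns a new list; the input list is left untouched.
    """
    updated = []
    new_record = None
    for record in history:
        if record.entity_id == entity_id and record.valid_to is None:
            # Close the validity period of the current record.
            updated.append(replace(record, valid_to=change_date))
            # Start a new record carrying the changed attributes.
            new_record = replace(
                record, valid_from=change_date, valid_to=None, **changes
            )
        else:
            updated.append(record)
    if new_record is None:
        raise KeyError(f"no current record for {entity_id}")
    updated.append(new_record)
    return updated


def state_on(history, entity_id, day):
    """Return the record valid on `day` (start inclusive, end exclusive)."""
    for record in history:
        if (
            record.entity_id == entity_id
            and record.valid_from <= day
            and (record.valid_to is None or day < record.valid_to)
        ):
            return record
    return None


history = [EntityState("e001", "politician", "@example_handle", date(2020, 1, 1))]
history = apply_change(history, "e001", date(2024, 1, 1), role="influencer")
print(state_on(history, "e001", date(2022, 6, 1)).role)
print(state_on(history, "e001", date(2024, 6, 1)).role)
```

Because no record is ever overwritten, the entity’s state on any past date can be looked up directly instead of being reconstructed from archived copies.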
Regularly reviewing and updating your seedlist helps preserve the integrity and accuracy of your data over time. This final post in our Seedlist Series shows how ongoing maintenance supports the broader goal of understanding online communities more fully and going beyond what platform-specific tools alone can capture.
[1] This may have taken us more time because we also collected data for all candidates. It would take less time if you update only the elected officials.
