Really big Twitter numbers: the flow of commotion/communication

This is the pitch:

DonateYourAccount.com is a project of Kyle Shank. In a blog post (9/3/2012), Labors of Love, he wrote about what he had hoped to accomplish, who has been using his system, and frustrations he has with the work.

The point of donate your account is the flow of communication. It is about spreading the word for supporters and messages shared for campaigns. The people who will donate their accounts are part of the flow. Their followers and friends are part of the flow. But the real point is increasing the flow of communication in campaigns.

This is a report about message flow. It is not volunteered message flow à la Kyle Shank. It is about the flow of messages during the last two nights of the Republican National Convention, which were led by the vice presidential nominee Paul Ryan and the presidential nominee Mitt Romney. They were busy evenings on the floor. They were busy evenings on TV; Nielsen reported that 30.3 million watched Romney's address (Fouhy, 9/5/2012). And they were busy evenings on Twitter. Twitter reported two million messages as "Ryan took the stage" and tweets hit four million "soon after the end of Governor Romney's speech."

Two million and four million are big numbers, but they count only the messages posted to Twitter. They do not connect the postings with the followers who read the messages. The numbers give you broadcast, but they do not close the communication loop by noting the flow as experienced by followers. They are a very partial counting of the flow of communication.

The data I am using were generously collected for me by Mike Jensen. His software accessed the Twitter sample API on Wednesday and Thursday evenings and asked for messages about Romney and Obama.

This is a study of the campaign as it was played out on Twitter during two nights of the Republican National Convention. One way to note that this was a campaign event rather than simply a convention event is: "This seat's taken," which was posted by the Obama campaign, was far and away the most frequently retweeted message of the convention period. It was repeated 24,673 times in the sample, which is more than three times as many as any other message was retweeted, and was available to 93 million. (Boynton, 9/6/2012)

The central question is: how much message flow was there?

Ryan's night

Wednesday night was Ryan's night. The sample collected that evening numbered 244,779 Twitter messages. Ryan was mentioned in 74,006 of them, Romney in 100,223, and Obama in 166,446. Since the search terms were variations of Obama's and Romney's names, the number for Ryan is probably an underestimate for the evening. That Obama was mentioned in 68% of the messages shows that he was the object of attention. The attention, however, was at least as much negative as positive. This was the night for accusing Obama of every 'sin' Republicans could think of.
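The shares behind these counts can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures given above; the function and variable names are mine, chosen for illustration.

```python
# Counts reported in the text for Wednesday night (Ryan's night).
SAMPLE_SIZE = 244_779
MENTIONS = {"Ryan": 74_006, "Romney": 100_223, "Obama": 166_446}


def mention_shares(mentions, sample_size):
    """Return each name's share of the sample as a percentage, rounded to 0.1."""
    return {
        name: round(100 * count / sample_size, 1)
        for name, count in mentions.items()
    }


shares = mention_shares(MENTIONS, SAMPLE_SIZE)
for name, pct in shares.items():
    print(f"{name}: {pct}% of sampled messages")
```

The three counts add up to more than the sample size, so the shares add up to more than 100%: a single message can mention more than one of the names.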

The number of Twitter messages is still about broadcast; it does not close the loop to estimate the communication flow. Assessing the communication flow requires information about the followers of the people posting the messages. At the end of 2010 Sysomos reported the number of followers of a large sample of Twitter users, and that makes a good comparison with the Twitter users posting messages during the convention.

               0 to 5 followers   6 to 100 followers   101+ followers
Sysomos        32%                52%                  16%
RNC messages   2.6%               25.6%                71.8%

These are dramatically different distributions. Only 2.6% of the 244,779 messages were going to five or fewer followers, compared with 32% of Twitter users in 2010 having 0 to 5 followers. The modal category for 2010 was 6 to 100 followers, with 52% of users having that number. The modal category for the RNC sample was 101+, which accounted for 71.8% of all messages.
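The three-way breakdown can be sketched as follows. The bucket boundaries come from the table; the follower counts in the example are hypothetical, made up only to show the mechanics, and the function names are mine.

```python
from collections import Counter

BUCKETS = ("0 to 5", "6 to 100", "101+")


def follower_bucket(follower_count):
    """Assign a follower count to one of the three ranges used in the table."""
    if follower_count <= 5:
        return "0 to 5"
    if follower_count <= 100:
        return "6 to 100"
    return "101+"


def bucket_percentages(follower_counts):
    """Percentage of messages per bucket, given one follower count per message."""
    counts = Counter(follower_bucket(n) for n in follower_counts)
    total = len(follower_counts)
    return {b: round(100 * counts[b] / total, 1) for b in BUCKETS}


# Hypothetical follower counts, one per message (not data from the study).
example = [0, 3, 40, 75, 100, 101, 250, 1_200, 8_000, 19_000_000]
print(bucket_percentages(example))
```

Counting per message rather than per account means an account that posts many times is counted many times, which is what a measure of message flow calls for.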

A second way to look at this distribution is to sort the messages from fewest to most followers, divide them into quintiles, and compute the average number of followers per quintile.

Ryan's night: mean followers by quintile

Messages (sorted by followers)   Mean followers
0 to 50,000                      28
50,001 to 100,000                116
100,001 to 150,000               285
150,001 to 200,000               765
200,001 to 244,789               17,715

For the bottom fifth the average number of followers was 28. In the 50K to 100K quintile the average was 116. Then come 285 and 765, and finally the top quintile, which averaged 17,715 followers per message. Even the lowest quintile had an average number of followers substantially above 0 to 5. The top quintile has 100 user accounts with more than 1,000,000 followers. President Obama led the way with 19 million. CNN Breaking News had 8.6 million, the New York Times had 6 million, Eva Longoria had 4.1 million, and Time.com had 3.8 million. They are followed by a large number of accounts with between two and three million followers. With a hundred above one million it is not hard to get to an average of 17.7 thousand.
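The sort-and-average procedure can be sketched like this. It assumes, as the table suggests, that the groups are consecutive blocks of a fixed number of messages with the last block holding the remainder. The function name and the sample values are hypothetical.

```python
from statistics import mean


def mean_followers_by_group(follower_counts, group_size):
    """Sort follower counts from fewest to most, cut them into consecutive
    groups of `group_size` messages (the last group keeps the remainder),
    and return the mean follower count of each group."""
    ordered = sorted(follower_counts)
    groups = [
        ordered[i:i + group_size]
        for i in range(0, len(ordered), group_size)
    ]
    return [mean(group) for group in groups]


# Hypothetical follower counts, one per message (not data from the study).
example = [5, 1, 3, 10, 20, 30, 100, 200, 300, 1_000_000, 2_000_000]
print(mean_followers_by_group(example, group_size=3))
```

Even in this toy example a couple of very large accounts in the last group dominate its mean, which is the same effect the hundred accounts above one million have on the top quintile.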

Two points are apparent. One, this is a huge communication flow. If you sum the number of followers for each Twitter message the number is 852,966,232. That is 'filling the air' with communications. Two, the people posting messages to Twitter are followed by a remarkable number of user accounts. Even those with the fewest followers have more than one would expect of Twitter users.
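As a rough cross-check, the summed followers can be approximately reconstructed from the quintile table by multiplying each group's size by its mean. The sketch below assumes the last group holds the remaining 44,779 of the 244,779 messages; because the means in the table are rounded, the result can only be approximate.

```python
# Group sizes and rounded means as given in the table for Ryan's night.
# Assumption: the last group holds the remaining 44,779 of 244,779 messages.
GROUPS = [
    (50_000, 28),
    (50_000, 116),
    (50_000, 285),
    (50_000, 765),
    (44_779, 17_715),
]
REPORTED_TOTAL = 852_966_232  # sum of followers reported in the text

approx_total = sum(size * avg for size, avg in GROUPS)
top_size, top_mean = GROUPS[-1]
top_share = top_size * top_mean / approx_total

print(f"reconstructed total: {approx_total:,}")
print(f"difference from reported total: {REPORTED_TOTAL - approx_total:,}")
print(f"share contributed by the top group: {top_share:.1%}")
```

Under these assumptions the reconstruction lands within a few thousand of the reported sum, and about 93% of the flow comes from the top group of messages.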

Romney's night

Then it was time for Mr. Romney to accept the nomination of his party. The sample was 591,462 messages. Ryan was mentioned in 33,095. Romney was mentioned in 361,507 posts to Twitter. And Obama was mentioned in 309,395. Attention swung to Romney even if just barely.

The follower counts for posts to Twitter on Romney's night are very similar to those of the night before.

 
               0 to 5 followers   6 to 100 followers   101+ followers
Romney night   2.4%               23.6%                74.1%

They are very close to the night before and they are very different from the Sysomos numbers for total Twitter users in 2010. Equally similar is the distribution when divided into quintiles.

Romney night: mean followers by quintile

Messages (sorted by followers)   Mean followers
0 to 120,000                     32
120,001 to 240,000               80
240,001 to 360,000               300
360,001 to 480,000               733
480,001 to 591,464               16,227

The average number of followers for the bottom quintile of messages posted was 32, and the average for the top quintile was 16,227. At the top there is a drop of roughly 1,500 followers from the night before, but the numbers are so enormous that the comparable size rather than the difference seems the important point.

The sample for the night was 591,462 messages posted to Twitter. Romney was mentioned in 361,561. Twitter reported two million messages for the day. A very conservative estimate would split the two million equally between the day and the night. If that were the case, the 361,561 would have been roughly a one-third sample of the evening's one million messages. When you sum the followers of the full set of posted messages the result is 1,952,040,498. If this was a one-third sample, that number would need to be multiplied by three, and the co-motion on that night was gigantic.
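The scaling argument can be written out as arithmetic. This sketch only restates the assumptions made in the paragraph above (an even split of the day's two million messages, and a sample treated as one third of the evening); the variable names are mine.

```python
ROMNEY_MENTIONS_IN_SAMPLE = 361_561      # figure given in the paragraph above
ASSUMED_EVENING_TWEETS = 2_000_000 // 2  # assumption: half of the day's two million
SUMMED_FOLLOWERS = 1_952_040_498         # sum of followers across the sample

# Sampling fraction implied by the assumed even split between day and night.
sampling_fraction = ROMNEY_MENTIONS_IN_SAMPLE / ASSUMED_EVENING_TWEETS

# Scale-up using the rounded "one-third" figure used in the text.
scaled_with_one_third = SUMMED_FOLLOWERS * 3

# Scale-up using the unrounded fraction instead.
scaled_with_fraction = round(SUMMED_FOLLOWERS / sampling_fraction)

print(f"implied sampling fraction: {sampling_fraction:.1%}")
print(f"scaled by three: {scaled_with_one_third:,}")
print(f"scaled by the implied fraction: {scaled_with_fraction:,}")
```

Because the implied fraction is a little above one third, multiplying by three gives a somewhat larger figure than dividing by the fraction itself. Either way the estimate is in the billions.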

Correcting for fake accounts

There is one obvious correction that should be made. We know that many user accounts on Twitter are fake (Wolford, 8/23/2012). Twitter is not the only social medium with fake accounts, but these are Twitter numbers. We have no idea how large the share of fake accounts is among the followers reached by messaging such as that during the Republican convention. A small web firm, Status People, has developed a system to estimate fakers. It is based on sampling followers. They find the 100,000 most recent followers and take a sample of 1,000 (Waller, 8/22/2012). Based on the activity of each of the 1,000 they estimate fake accounts, inactive accounts, and good accounts. The criteria for the three categories are vague, at best. Accounts that follow many, have few followers, and post few tweets are considered fake. The other two categories are not described. Presumably, inactive means accounts that have few posts, few followers, and follow few. Good would be accounts that post, follow, and are followed. By their criteria the results for the accounts with the most followers in the Romney data set are:

 
Account                            Follower count   Fake   Inactive   Good   Good + inactive
Obama - barackobama                19,068,078       21%    33%        46%    15,063,781
CNN Breaking News - cnnbrk         8,619,387        23%    47%        30%    6,636,927
New York Times - nytimes           6,003,374        30%    42%        28%    4,202,361
CNN - cnn                          5,778,905        37%    38%        25%    3,640,710
Perez Hilton - PerezHilton         5,501,293        20%    44%        36%    4,401,034
Breaking News - breakingnews       4,555,200        14%    40%        46%    3,917,472
will.i.am - iamwill                4,222,143        15%    43%        42%    3,588,821
Eva Longoria - evalongoria         4,158,080        25%    45%        30%    3,118,560
Time.com - time                    3,831,731        21%    45%        34%    3,027,067
Peter Cashmore - mashable          2,987,684        11%    40%        49%    2,659,038
Anderson Cooper - andersoncooper   2,962,324        19%    45%        36%    2,399,482
Total                              67,688,199                                52,655,258

First, the percentages are based on a sample. Take a second sample and you get slightly different numbers. For most accounts the fakes are roughly 20%, plus or minus 5 points; the exceptions are The New York Times and CNN, with 30% or more, and Peter Cashmore with 11%. The number of inactive accounts following CNN Breaking News, The New York Times, Time, and Anderson Cooper suggests an interpretation of this style of Twitter action. Inactive might be interpreted as the Twitter version of TV. You turn the TV on to watch. You do not turn the TV on to have your say. Many people must be doing exactly that with Twitter. They follow people to read what they have to say, and news media are high on their list of reading. It is, for them, mostly a news medium. There is another interpretation: they signed up and never returned. There must be many like that, but there is no way from this classification to determine how many of each of the possibilities are present. However, we can learn something about the limit on retweeting. If about 20% are fake and about 40% are observers, then retweeting is limited to 'good' users, who range from 25% to 49%.
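How much a second sample of 1,000 might differ can be gauged with the usual margin of error for a proportion. This sketch assumes a simple random sample and the normal approximation; Status People's actual procedure may differ, and the margin describes only the pool of 100,000 recent followers they sample from. The function name is mine, and the three percentages are those reported for President Obama's followers.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from a
    simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)


SAMPLE_SIZE = 1_000  # followers sampled per account, as described above

# Percentages reported for President Obama's followers.
for label, p in [("fake", 0.21), ("inactive", 0.33), ("good", 0.46)]:
    moe = margin_of_error(p, SAMPLE_SIZE)
    print(f"{label}: {p:.0%} plus or minus {moe:.1%}")
```

Under these assumptions the sampling error is on the order of two to three percentage points, so differences of ten points or more between accounts are unlikely to be sampling noise alone.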

It is telling that 46% of the president's followers are 'good' accounts. That is 8,771,315 accounts, which is a very big base for retweeting the message into an exponential explosion. But it is also telling because politics on Twitter has been found to be quite different from standard practice in so many ways.

If you throw out the fake accounts and recompute the number of followers the total goes from 67,688,199 to 52,655,258 for a drop of 22%. That leaves the total remarkably high.
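The recomputation can be reproduced from the follower counts and fake percentages in the table. This is a minimal sketch; the data structure and variable names are mine, and the figures are copied from the table above.

```python
# (account, follower count, fake percentage) copied from the table.
ACCOUNTS = [
    ("barackobama", 19_068_078, 21),
    ("cnnbrk", 8_619_387, 23),
    ("nytimes", 6_003_374, 30),
    ("cnn", 5_778_905, 37),
    ("PerezHilton", 5_501_293, 20),
    ("breakingnews", 4_555_200, 14),
    ("iamwill", 4_222_143, 15),
    ("evalongoria", 4_158_080, 25),
    ("time", 3_831_731, 21),
    ("mashable", 2_987_684, 11),
    ("andersoncooper", 2_962_324, 19),
]

total_followers = sum(followers for _, followers, _ in ACCOUNTS)

# Followers remaining once each account's estimated fake share is removed.
non_fake = sum(followers * (100 - fake) for _, followers, fake in ACCOUNTS) / 100

drop = 1 - non_fake / total_followers

print(f"total followers: {total_followers:,}")
print(f"good + inactive followers: {round(non_fake):,}")
print(f"drop: {drop:.0%}")
```

The 22% is a follower-weighted average of the fake percentages, so the accounts with the most followers, President Obama's above all, pull it toward their own figures.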

How do the numbers for the most followed compare with people who have, comparatively, few followers? We do not know. I computed the scores for ten persons who had 1,019 followers. The average share of fakes was 4.5%. The average share of inactive followers was 15.7%. And the average share of good followers was 77.8%. A non-random selection of 10 persons hardly constitutes evidence. However, it does agree with expectations. If you have 1,000 followers you are not famous enough for bots to add fake accounts, so fake accounts should be low. With a thousand followers you are not famous enough to attract a large number of readers. And politics on Twitter is about interaction. That suggests that people who follow each other are going to be active.

But we know so little about this that one can only see this as a tiny start.

Consider

Imagine that you only need to reduce 1,952,040,498, which is the sum of followers for the Romney data, by twenty percent. Students of communication have never been in a position to consider such large numbers. But this is the message flow; it is in a very concrete way the communication of the night via Twitter. What should we make of this?

My argument is that we should understand this as the reconstruction of the public domain. We are moving from broadcast-audience to co-motion. This is what is emerging. Twitter reported that in 2008 there were fewer messages posted during the campaign than were posted during the Republican convention. It is an explosion, and we are on the cusp. The reconstruction goes well beyond these numbers, as I have suggested elsewhere, but these numbers are part of a change that is enormous (Boynton, 8/24/2012 and Boynton and Richardson, 5/5/2012). What it will become we can hardly imagine.

References

Boynton, G. R. (9/6/2012) The reach of "This seat's taken"

Boynton, G. R. (8/24/2012) Voice and the Public Domain Becomes Co-Motion

Boynton, G. R. and Glenn W. Richardson (5/5/2012) Reframing Audience; Co-Motion at #SOTU

Fouhy, Beth (9/5/2012) Republican Convention Ratings Plummet from 2008, Huff Post Media

Shank, Kyle (9/3/2012) Labors of Love

Sysomos blog (12/2010) Twitter Statistics for 2010

Twitter Blog (8/30/2012) A four million Tweet convention: That's a wrap for #GOP2012

Twitter Blog (8/29/2012) RNC Night Two: Paul Ryan takes the stage

Waller, Rob (8/22/2012) Fakers App Check Extended to 100,000 Followers, blog post Status People

Wolford, Josh (8/23/2012) Over 27% of the Top 10 Twitter Accounts' Followers Are Fake, WebProNews

© G. R. Boynton, September 10, 2012