The Many Voices of ACTA
G. R. Boynton

ACTA, the Anti-Counterfeiting Trade Agreement, is a case of the 'foreign' in foreign policy slipping away. Global industries in entertainment, pharmaceuticals, agriculture, and probably many more have importuned the major western nations to establish a global body that would control the nations' policies and regulate for the industries' benefit. One of the many strands is control of the internet for the special benefit of the entertainment industry.

Much is not known about ACTA. Writing in Forbes, E. D. Kain makes this point at some length. [Kain, 1/23/2012] It has been negotiated in secret, and even though it is now being ratified very little is known about the provisions. But in at least some iterations it would require internet service providers to do the work of ferreting out illegal use of entertainment materials, rather than the entertainment industry or government needing to do that work. This has inspired considerable opposition from those whose behavior would be controlled. A significant objection is the extent of surveillance that would be required just in case someone might misdirect intellectual property. It is the kind of concern that grows out of news reports about what companies are finding out about consumers, such as "How Target Figured Out a Teen Girl Was Pregnant Before Her Father Did." [Hill, 2/16/2012] That level of surveillance seems inappropriate to many people. However, surveillance is not the point of this note.

The point is to call attention to the globalization involved in this move. The industries are global and have been global for many decades. The nations are planning to set up a global agency that would be responsible for -- actually what it would be responsible for is not very well known. But it will be global. It will supersede what nations will do in these areas. And it has inspired a global citizen response, which is the many voices the analysis is about.

One should note that ACTA is not the only 'game in town.' In the United States it was SOPA that would involve the government requiring actions on the part of many industries either directly or peripherally involved in internet use and management. Industries and individuals rose up in horror as they learned about the provisions of the legislation. Industries called and individuals conducted an internet campaign, and Congress backed down, but proponents do not give up easily. [Fulton, 2/16/2012] Canada had its own internet control legislation put forward, and it was attacked by industry and the public. Legislators in Canada backed down in much the same way they did in the U.S. [Cottingham, 2/18/2012] ACTA is one among many efforts the global industries are pursuing to guarantee their control of production and dissemination of what they claim as their own.

ACTA is not limited to, but is heavily dependent on, the European nations agreeing to it -- both as nations and as the EU. It depends on nations first signing on and then ratifying the treaty. Most of the nations signed in the fall of 2011. Ratification got under way at the beginning of 2012. While there were protests earlier [Napolitano, 2/3/2012], the major protest was organized for the weekend of February 10, 11, and 12. Thousands of people rallied all over Europe despite weather that was not conducive to being out of doors. And there was a major protest on Twitter -- anti-ACTA messages by the tens of thousands. The number of Twitter messages per day is shown in the figure.

I began collecting Twitter messages at 15:42 on the tenth. Had I been collecting all day there would have been more, though it is clear from the number collected between 15:42 and midnight that the protest was just getting off the ground. Both collection and translation procedures are discussed in the methodological appendix. The big burst was on Saturday the 11th. There were 75,592 messages that day containing ACTA. It dropped to 30,000 the next day, stayed there for a second day, and has slowly declined since. There was a small bump on Friday the 17th, and it has declined from there.

This figure pictures speaking with one voice. The totals indicate the volume of the message stream concerned with the imposition of ACTA on the use of the internet. And as in the United States and Canada the politicians have begun to back down.

But they also spoke in many voices. There are many languages present in the stream of messages about ACTA. It is that distribution that I wanted to capture.

Well, it's simple. Just read them and identify the languages. Except my Polish is not too good. And there are one or two more that are beyond my language abilities. So another procedure was needed. Google to the rescue. Google makes its fortune being able to take questions in any language and give a response in any language. Well, I think it is 80 languages now. And they make the tools for identifying languages available in several ways. I took the first 25,000 Twitter messages I collected and had Google spreadsheet identify the language of each message.

Two features of the language use stand out for analysis.

First, there are estimates of the languages being used on the internet. They are estimates in terms of numbers of people rather than messages, but it is not unreasonable to compare those estimates with the languages used in the Twitter messages.

Top Ten Languages in Twitter Messages

English is at the top of both lists. English is the 'lingua franca' of the internet, so many individuals who can write 140 characters in English will do so when they want to communicate with the world. The European focus of messages about ACTA shows up in European languages being more frequent in the Twitter messages than in internet language use in general. German is second instead of Chinese. Spanish is third on both measures. But French and Dutch are used considerably more frequently in messages about ACTA than in general language use. And Romanian, Italian, Polish and Croatian are among the top ten languages used in messages about ACTA, but are not in the top ten for general language use. ACTA was negotiated and would be ratified by Western European nations, so Chinese, Japanese, Arabic, Russian, and Korean do not appear high among the languages used to communicate about ACTA.

While the top ten languages are a substantial majority of language use, many other languages were also used and thus contributed to the many voices of ACTA. The figure pictures the frequency of use in Twitter messages.

Languages Used in Opposition to ACTA in Lesser Number

Only two, Swedish and Danish, appear more than 100 times. But there is a broad representation of languages in the messages: Turkish, Serbian, Latvian, etc. Languages from outside the Western European realm re-enter at this level of participation. Russian, Arabic, Korean, Sudanese, Vietnamese and others are used in protesting ACTA even if at a very low level.

The many voices of ACTA are users of 63 languages expressing their opposition to the treaty via Twitter. And 250,000 messages strong they take to the internet to express their dismay at what ACTA would bring. The response was a significant pulling back by the politicians who otherwise would have ratified the treaty while speaking for their countries.

The analysis takes us back to Schattschneider and Deutsch. Famously, Schattschneider said when you start a fight it attracts a crowd and the outcome is likely to be dependent on the distribution in the crowd. This certainly looks like that. ACTA was negotiated in secret so that a crowd would not discover it. But they had to go public, a fight ensued, and politicians caught in the middle seem to be counting votes instead of dollars at the moment. But the outcome is far from being determined. Dollars do have staying power while votes tend to be more transient.

Deutsch's central thesis was that when transactions across borders exceed transactions within borders a new global order is in the offing. The corporations have been global for decades. Monsanto, the music industry, the film industry know no borders -- if they can arrange it. So it is no surprise that they have supported a global 'solution' to what they consider their problems. What they have done, however, is generate a global opposition that is greatly facilitated by the new media available to the opponents. The opposition becomes a 'we' in sharing opposition to the corporate grab for the internet, in needing to protest as the principal means of expressing concerns, and in sharing photos of each other carrying the banners of the fight. The supporters of ACTA are global. The opponents of ACTA are global. Pity the poor politicians who are trying to figure out how to do foreign policy, local, in this global world -- to their own benefit.

Methodological Appendix: Google spreadsheet and translation

I like Schattschneider and Deutsch. Any time I have the opportunity to write in their ideas I do it. But the major point of this small analysis is introducing technologies for studying new media and foreign policy.

New media are new; YouTube, Facebook, and Twitter serve as the public face of new media. There are, of course, many more media possibilities. Twitter should be of particular importance for political scientists because it is 250 million messages a day and they are, by default, public. A small minority of Twitter messages are about politics; a small minority of 250 million is still a large number of messages. For example, I am getting 100,000+ messages a day that mention Obama. That is more than several months ago, but this is February 2012 and I expect the number to increase as the November election approaches.

I introduced Archivist, which I use to collect Twitter messages, in X so there is no reason to repeat that here. I will describe using Google spreadsheet for doing translation work. The spreadsheet program has two functions that are helpful: =detectlanguage() and =GoogleTranslate(). You use them the same way that a function like =sum() is used.

Constraints on using Google spreadsheet: Users have a 1 gigabyte limit for free disk storage, and the free storage is shared by all of one's Google accounts. If you store photos on Picasa, for example, that counts against the limit. Since Twitter messages can be captured in large numbers you may need to get more storage. You can check the size of the file or files to decide if you need more storage. Unfortunately, Google does not have a way to report how much space you are already using. The only way to find out that you do not have enough is to upload a file. If there is not enough space it will tell you after it has uploaded the file. To purchase more disk space log in to Google docs. At the top right-hand corner you will see a spoked wheel. Click on it to go to settings. That has an entry for Storage, and you can click there and give them a bit of money for additional storage. While they take your money instantly they are slower to allocate more disk space, but they get to allocating space after a few hours.

A second constraint in using Google spreadsheet for analysis is an upper limit on the amount of information the spreadsheet program will accept. They say it is 400,000 cells. It seems that you could have a file with 100,000 Twitter messages [rows in the spreadsheet] with 4 columns. But cells are not exactly what they are measuring. If a cell contains the text of a Twitter message it counts as much more than a cell in the 400,000 calculation. You may need to experiment with the file you are using for analysis.

A third constraint is Google spreadsheet's ability to page through a file. It seems to do calculations faster than it moves through a file for display on the screen. Getting from the top of the file to the bottom can be a slow process.

For this example I took 25,000 messages and deleted everything other than the text of the messages, which I saved as an xlsx [Excel] file type. It was small enough to fit both the disk storage and spreadsheet size requirements. It was considerably below both, but it took patience to get from the top to bottom.

Log in to Google docs and click on upload, which is now a red symbol to the right of CREATE. If you upload a spreadsheet file type, xlsx for example, Google will recognize it is meant to be a spreadsheet and will translate it into its own spreadsheet format. When the uploading is finished you can click on the file name and it will open in the spreadsheet program.

I am going to assume some knowledge of using spreadsheets, but it does not require much. I had one column, A, with 25,000 rows, 1 to 25,000. In column B I typed =detectlanguage(a2). It identified the language of the Twitter message and printed a 2 character language code in the cell. The formula was displayed above the first row after fx. I selected the cell B2, which holds the formula, went to Edit above and to the left, clicked and selected Copy. Then I selected the rest of column B: I clicked the cell in column B and row 3, moved down to the bottom of the file, held down the Shift key and clicked in the last cell, which selected all of the rows, went to Edit, and selected Paste. If the file is large, it will put Loading in each cell until it has worked its way down to the cells you can see. At that point you have a column of two character language identifiers. The file is saved. You can sort the column to see how many there are for each language. A list of the 2 character language codes is appended to this note.
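Sorting the column works, but counting by eye is tedious with 25,000 rows. As an alternative, here is a minimal Python sketch of the tally. It assumes the sheet has been exported as comma-separated text with the message in the first column and the language code in the second; the function name, the column position, and the sample rows are hypothetical and stand in for the real file.

```python
import csv
import io
from collections import Counter


def tally_language_codes(rows, code_column=1):
    """Count how often each language code appears in one column of the rows."""
    counts = Counter()
    for row in rows:
        if len(row) > code_column:
            code = row[code_column].strip().lower()
            if code:  # skip empty cells, e.g. rows still showing no result
                counts[code] += 1
    return counts


# Hypothetical sample standing in for the exported file (assumption:
# a header row, message text in column 1, language code in column 2).
sample = io.StringIO(
    "text,language\n"
    "Stop ACTA now,en\n"
    "Nie dla ACTA,pl\n"
    "ACTA ist gefaehrlich,de\n"
    "No to ACTA,en\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row
counts = tally_language_codes(reader)

# Print the codes from most to least frequent.
for code, n in counts.most_common():
    print(code, n)
```

To use it on a real export, replace the sample with an opened file, for example open("messages.csv", newline="", encoding="utf-8"), where the file name is again only a placeholder.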

If you are going to do any analysis with a file of Twitter messages you can do the analysis with Google spreadsheet. However, it is not very fast compared with a desktop spreadsheet like Excel. So you may want to download the file for analysis. The file has been saved so you can close the tab of the browser. Then select the file with a check in the square to its left. Go up to More, click and select Download. It will ask if you want to save the file or open it with your default spreadsheet program. After that choice all you have to do is wait.

Translation works exactly the same. Put =googletranslate(a2) in a cell and it should translate the text in A2 and place the translation in the cell in which you wrote the formula.

The language of Twitter messages: Text in Twitter messages has a number of features that make identification of language and translation difficult.

First, not every string in a tweet is a word. Usernames are often part of a tweet, and they confuse a dictionary lookup procedure because they do not exist in the dictionary. Acronyms and spellings designed to save space in a Twitter message are not found by a dictionary lookup procedure either.

URLs in Twitter messages take the form http://t.co/xxxxxxx. There is nothing in that string that Google can look up. Retweets, RT @username, also pose a problem.

The best thing to do is go through and delete as many of the non-words as you can. Acronyms are as bad as anything else. I had a number of Twitter messages that contained both SOPA and PIPA, and Google thought that required some language other than English, even though the rest of the words were English.
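Deleting the non-words by hand does not scale to thousands of messages. The cleanup can be sketched in Python and run on the text before it is uploaded. This is an illustration under stated assumptions, not the procedure used for the analysis above: the patterns and the function name are hypothetical, and the default acronym list holds only the two acronyms mentioned in this note.

```python
import re

# Patterns for the non-word strings described above (all assumptions).
URL_RE = re.compile(r"https?://\S+")   # links such as http://t.co/xxxxxxx
RETWEET_RE = re.compile(r"\bRT\b")     # the retweet marker
MENTION_RE = re.compile(r"@\w+:?")     # usernames, with an optional colon


def clean_tweet(text, acronyms=("SOPA", "PIPA")):
    """Strip URLs, retweet markers, usernames, and listed acronyms."""
    text = URL_RE.sub(" ", text)
    text = RETWEET_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    for acronym in acronyms:
        text = re.sub(r"\b" + re.escape(acronym) + r"\b", " ", text)
    # Collapse the runs of spaces left behind by the deletions.
    return " ".join(text.split())


print(clean_tweet("RT @someone: Stop SOPA and PIPA now http://t.co/abc1234"))
```

The acronym list is a parameter because what counts as noise depends on the collection. ACTA itself is left in place here, and an empty tuple turns acronym removal off.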

Google spreadsheet can be very helpful in identifying language and translating when you have many messages to work with. But you do have to set up your account for the work and you do have to clean up the text as much as possible.

References

Cottingham, Rob (2/18/2012) Cartoon: Either Stand With Us or With Those Internet Geeks, ReadWriteWeb

Fulton, Scott M. (2/16/2012) On Second Thought, Senate Won't Debate Cybersecurity Before Floor Vote, ReadWriteWeb

Hill, Kashmir (2/16/2012) How Target Figured Out a Teen Girl Was Pregnant Before Her Father Did, Forbes

Internet World Stats, Internet World Users by Language; Top 10 Languages

Kain, E. D. (1/23/2012) If You Thought SOPA was Bad, Just Wait Until You Meet ACTA, Forbes

Napolitano, Antonella (2/3/2012) Slovenian ambassador apologizes for signing ACTA, Poland halts ratification, techPresident

Two character language codes

No.  Language Name          Native Language Name  Code
1    Afrikaans              Afrikaans             af
3    Arabic                 عربي                  ar
5    Azerbaijani            آذربایجان دیلی        az
7    Belarusian             Беларуская            be
8    Bulgarian              Български             bg
9    Catalan                Català                ca
13   Czech                  Čeština               cs
57   Welsh                  Cymraeg               cy
14   Danish                 Dansk                 da
23   German                 Deutsch               de
24   Greek                  Ελληνικά              el
16   English                English               en
49   Spanish                Español               es
17   Estonian               Eesti keel            et
6    Basque                 Euskara               eu
41   Persian                فارسی                 fa
19   Finnish                Suomi                 fi
20   French                 Français              fr
31   Irish                  Gaeilge               ga
21   Galician               Galego                gl
27   Hindi                  हिन्दी                 hi
12   Croatian               Hrvatski              hr
25   Haitian Creole         Kreyòl ayisyen        ht
28   Hungarian              Magyar                hu
4    Armenian               Հայերէն               hy
30   Indonesian             Bahasa Indonesia      id
29   Icelandic              Íslenska              is
32   Italian                Italiano              it
26   Hebrew                 עברית                 iw
33   Japanese               日本語                ja
22   Georgian               ქართული               ka
34   Korean                 한국어                ko
36   Lithuanian             Lietuvių kalba        lt
35   Latvian                Latviešu              lv
37   Macedonian             Македонски            mk
38   Malay                  Malay                 ms
39   Maltese                Malti                 mt
15   Dutch                  Nederlands            nl
40   Norwegian              Norsk                 no
42   Polish                 Polski                pl
43   Portuguese             Português             pt
44   Romanian               Română                ro
45   Russian                Русский               ru
47   Slovak                 Slovenčina            sk
48   Slovenian              Slovensko             sl
2    Albanian               Shqip                 sq
46   Serbian                Српски                sr
51   Swedish                Svenska               sv
50   Swahili                Kiswahili             sw
52   Thai                   ไทย                   th
18   Filipino               Filipino              tl
53   Turkish                Türkçe                tr
54   Ukrainian              Українська            uk
55   Urdu                   اردو                  ur
56   Vietnamese             Tiếng Việt            vi
58   Yiddish                ייִדיש                 yi
10   Chinese (Simplified)   中文简体              zh-CN
11   Chinese (Traditional)  中文繁體              zh-TW

© G. R. Boynton, February 22, 2012