
2. Data Cooperatives

Published on Apr 30, 2020


During the last decade, all segments of society have become increasingly alarmed by the amount of data, and resulting power, held by a small number of actors [1]. Data, famously called “the new oil” by some [2], comes from records of the behavior of citizens. Why, then, is control of this powerful new resource concentrated in so few hands? During the last 150 years, questions about concentration of power have emerged each time the economy has shifted to a new paradigm.

As the economy was transformed by industrialization and then by consumer banking, citizens felt trapped and exploited by powerful new companies. In order to provide a counterweight to these new powers, citizens joined together to form trade unions and cooperative banking institutions. Eventually the struggle reached a point where citizens felt that powerful players such as Standard Oil, J.P. Morgan, and a handful of others threatened freedom itself, and the federal government instituted anti-trust laws, labor rights, and banking reform. The citizen organizations were key in helping to balance the economic and social power between large and small players and between employers and workers.

Today the same sort of citizen organizations can help us move from the current paradigm of individuals giving up data to large organizations to a system based on collective rights and accountability, with legal standards upheld by a new class of representatives who act as fiduciaries for their members. There are many examples of community organizations using community data to manage investments for the good of the community. For instance, beginning in 1943 the National Rural Electric Cooperative Association has electrified communities that cover over half of the US land area. Similarly, there are over 1,100 Community Development Financial Institutions, mainly small banks and credit unions, investing over $220B in community projects, including over 300 focused primarily on economic, social, and political justice. And with 100 million people belonging to credit unions, the opportunity for community organizations to leverage community-owned data is huge.

Indeed, with advanced computing technologies it is practical to automatically record and organize all the data that citizens knowingly or unknowingly give to companies and the government, and to store these data in community organization vaults. In addition, almost all of these community organizations already manage their accounts through regional associations that use common software, so widespread deployment of data cooperative capabilities could be surprisingly quick and easy.

Data Cooperatives as Citizens’ Organizations

The notion of a data cooperative refers to the voluntary collaborative pooling by individuals of their personal data for the benefit of the membership of the group or community. The motivation for individuals to get together and pool their data is driven by the need to share common insights across data that would be otherwise siloed or inaccessible. These insights provide the cooperative members as a whole with a better understanding of their current economic, health and social conditions as compared to the other members of the cooperative generally.

It is technically straightforward to have a third party such as a cooperative hold copies of its members’ data, in order to help them safeguard their rights, to represent them in negotiating how their data is used, to alert them to how they are being surveilled, and to audit the large companies and government institutions using their data. In fact, the last half of this Chapter presents a blueprint for the processes and software required to accomplish these tasks.

Nor does creation of such data cooperatives require new laws; many community organizations are already chartered to manage members’ personal information for them. It does, however, require new regulation and oversight, similar to how the government regulates and provides oversight of financial institutions. The last half of this Chapter and the final Chapter of this book discuss how this might happen in more detail.

It is critical to note that community organizations that manage members’ data must have fiduciary responsibilities to protect the sensitive information that they hold for members, as this is a central element in bringing data rights to the membership. This enables members to improve privacy and transparency as to data use, and empowers them to collectively direct the use of their data to their benefit.

Who will lead this historic and necessary transformation? The answer could well grow out of current-day credit unions, many of which are directly associated with universities, city governments, trade unions, and the like. They are already chartered to represent their members in financial transactions and hold members’ data for them.

The ability to balance the world’s data economy depends on creating a balance of stakeholders. Today citizens and workers have no direct representation at the negotiation table, and so lose out. By leveraging cooperative worker and citizen organizations we can change this situation and create a sustainable digital economy that serves the many, and not just the few. The power of 100 million US consumers who control their data would be a force to be reckoned with by all organizations that use citizen data and would be a very decisive way to hold these organizations accountable. The same potential for community organizations to balance today’s data monoliths exists in most countries around the world.

Communities Using Their Data

What new advantages can communities have if they have the ability to analyze their data? People often think of monetizing personal data, but the reality is that while there is a great deal of value in aggregate data for specific purposes, there is no market mechanism for data exchange, and so personal data does not have very much value on an individual basis. Personal data, and community data, will only become a serious source of revenue when privacy-respecting data exchanges become a major part of the general financial and economic landscape. This is described in more detail in Chapter 3.

However, monetization is only a minor part of data’s value to a community, especially in today’s economic climate. A greater source of value is in improving the living conditions of the community members, and ensuring the success of future generations. For instance, the COVID-19 pandemic has highlighted major disparities in public health between different communities. Data about community public health is necessary to address these disparities, but today that data is unavailable to communities in all but the most general terms. Chapter 7 addresses this problem in more detail.

Economic Growth

Communities need data about their economic health in order to plan their future, but the data required for neighborhood-level planning is unavailable to them. Only the aggregate statistics of production and wage distribution used by economists are generally available. With the development of community-owned data cooperatives this could change dramatically.

As an example, Chong, Bahrami, Chen, Bozkaya and Pentland [3] recently developed a neighborhood attractiveness measure that uses the diversity of amenities within a neighborhood to predict the volume and diversity of human flows into that neighborhood, which in turn predicts economic productivity and economic growth on a neighborhood-by-neighborhood basis.

Their attractiveness measure is based on the relationship shown in Figure 1, which illustrates the connection between the number of unique shop categories in a neighborhood (defined in terms of the standardized merchant category code, MCC) and the total flows of people into each district over a period of one year. The high correlation of 0.789 shows that measuring neighborhood attractiveness by diversity of shops and public spaces can be an excellent predictor of future foot traffic. This data is from the city of Istanbul, but similar relationships have been demonstrated in the EU, US, and Australia.

Figure 1 Scatter plot of the number of unique shop categories within a neighborhood versus total inflow (visitor) volumes of each neighborhood.
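The relationship plotted in Figure 1 can be sketched in a few lines of code. The example below is a minimal illustration, not the procedure used in [3]: it counts the distinct merchant category codes in each neighborhood and computes the Pearson correlation of that count with visitor inflow. The shop records, the inflow figures, and the function names are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def unique_categories(shop_records):
    """Number of distinct merchant category codes per neighborhood."""
    cats = {}
    for hood, mcc in shop_records:
        cats.setdefault(hood, set()).add(mcc)
    return {hood: len(codes) for hood, codes in cats.items()}

# Hypothetical records: one (neighborhood, MCC) pair per shop.
shops = [
    ("A", 5411), ("A", 5812), ("A", 5912), ("A", 5812),
    ("B", 5411), ("B", 5812),
    ("C", 5411), ("C", 5812), ("C", 5912), ("C", 7832), ("C", 5942),
    ("D", 5411),
]
# Hypothetical total visitor inflow per neighborhood over one year.
inflow = {"A": 12000, "B": 7000, "C": 21000, "D": 3000}

diversity = unique_categories(shops)
hoods = sorted(diversity)
r = pearson([diversity[h] for h in hoods], [inflow[h] for h in hoods])
print(diversity, round(r, 3))
```

With real community data the same two steps would apply, although a cooperative would likely want many more neighborhoods than this toy example before trusting the correlation.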

There is a dynamic relationship between the attractiveness of a neighborhood and its economic growth. The attractiveness of diverse amenities (e.g., parks and other public spaces) increases the inflow of people from different neighborhoods. This inflow in turn creates opportunities that boost investment and increase the availability of even more diverse amenities.

The neighborhood attractiveness measure of [3] allows communities to use their private data, specifically the pattern of in-store purchases, to predict what new stores and amenities will increase the economic productivity of the neighborhood. Moreover, as a neighborhood becomes more attractive, through new amenities and more diverse visitors, entrepreneurs respond by offering yet more diverse amenities in order to cater to the tastes and preferences of the new people visiting the neighborhood. Consequently, the same sort of community data can be used to predict future economic growth of the community.

Figure 2 (left) diversity of consumption, (right) year-on-year economic growth for neighborhoods within the city of Beijing. The diversity of consumption (or the diversity of visitors) predicts up to 50% of the variance in year-on-year economic growth for Beijing, as well as for US and EU cities [3].

Figure 2 illustrates the measured relationship between neighborhood attractiveness and the percentage changes in economic indicators for neighborhoods in Beijing; similar results have been obtained on three different continents. In all three cases, the diversity of consumption is a strong predictor of economic growth, with correlations with economic growth in the following year of 0.71 (Istanbul), 0.54 (Beijing), and 0.52 (US).
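To relate these correlations to the "variance explained" language in the Figure 2 caption, one can square them. This is a small worked example under an assumption that is ours rather than the source's: that each figure is the correlation of a single linear predictor, in which case the share of variance explained is the square of the correlation.

```python
# Correlations with next-year economic growth, as reported in the text.
reported_r = {"Istanbul": 0.71, "Beijing": 0.54, "US": 0.52}

def variance_explained(r):
    """Share of variance explained by one linear predictor with correlation r."""
    return r * r

shares = {city: variance_explained(r) for city, r in reported_r.items()}
for city, share in shares.items():
    print(f"{city}: r = {reported_r[city]:.2f}, r^2 = {share:.2f}")
```

Under that assumption the largest reported correlation, 0.71, corresponds to roughly half of the variance in next-year growth.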

Figure 3 Diversity of consumption versus year-on-year growth in GDP after controlling for population density, housing price, and the geographical centrality [3].

Economic growth is complex, however, and influenced by many factors. As shown in Figure 3, even if we also account for factors such as population density, housing price index, and the geographical centrality of the district within the city, the ability of community diversity data to predict economic growth is still quite strong, with correlations of R=0.41 (Beijing), 0.72 (Istanbul), and 0.57 (US). This provides evidence that the attractiveness of local amenities and services is a strong determinant of neighborhood growth.
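One standard way to "account for" another factor is a partial correlation. The sketch below is an illustration only: it controls for a single variable using the textbook first-order partial correlation formula, whereas the analysis in [3] controls for several factors at once and its exact method is not described here. All data values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical per-neighborhood values.
diversity = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7]   # diversity of consumption
growth = [1.0, 2.6, 0.7, 3.1, 1.9, 2.4]      # year-on-year growth, percent
density = [5.0, 7.5, 4.0, 6.0, 8.0, 6.5]     # population density, control variable

raw = pearson(diversity, growth)
controlled = partial_corr(diversity, growth, density)
print(round(raw, 3), round(controlled, 3))
```

If the controlled value stays well above zero, diversity carries predictive information that the control variable does not already explain, which is the pattern the text reports for Figure 3.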

Small business planning

By using community data, we can begin to build more vibrant, economically successful neighborhoods. For instance, to promote growth in a specific neighborhood we can alter transportation networks to make the neighborhood accessible to more diverse populations, and invest in diverse stores and amenities in order to attract diverse flows of people.

Importantly, we can use community data to evaluate how to allocate investments to maximize the expected impact on the economy of the target neighborhood. Communities need not rely on annualized values of traditional economic indicators for planning purposes but would instead be able to make reliable estimates of what sort of stores will succeed, and determine whether or not they will contribute to the general prosperity of the neighborhood. For instance, Netto, Bahrami, Brei, Bozkaya, Balcısoy, and Pentland [4] have shown that by combining a generic model of how people move around the city (the “gravity model”) with community data describing the concentration and variety of amenities in the neighborhood, they can accurately predict the foot traffic and sales of proposed stores and public amenities.

The method they developed is far better than existing methods, and is flexible and robust enough to estimate other key marketing variables, such as anticipated market share, units sold, or other forecasting targets. Consequently, community planners may use it for tax estimation, to understand which types of new stores or community resources (parks, etc.) can stimulate population flow toward different neighborhoods, and to plan city dynamics and commercial growth so that flows of people into different areas boost the local economy.
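For readers unfamiliar with gravity models, the sketch below shows the generic textbook form, in which expected visits grow with the population of an origin and the attractiveness of a destination and fall off with distance. It is not the model of [4], whose exact specification is not given here; the function name, the parameters `beta` and `k`, and all of the numbers are hypothetical.

```python
def gravity_visits(origins, site_attractiveness, beta=2.0, k=1.0):
    """Expected visits from each origin to one site.

    Generic gravity form: k * population * attractiveness / distance**beta.
    """
    return {
        name: k * population * site_attractiveness / distance ** beta
        for name, (population, distance) in origins.items()
    }

# Hypothetical origin neighborhoods: (resident population, distance in km to the site).
origins = {
    "North": (20000, 2.0),
    "East": (15000, 1.0),
    "South": (30000, 4.0),
}

# Hypothetical attractiveness score, e.g. the number of distinct amenity
# categories near the proposed store.
visits = gravity_visits(origins, site_attractiveness=12, beta=2.0, k=0.01)
total = sum(visits.values())
for name, v in visits.items():
    print(f"{name}: {v:.0f} expected visits ({v / total:.0%} of total)")
```

In practice `beta` and `k` would be fitted to observed flows, and this is where community data on the concentration and variety of amenities would enter as the attractiveness term.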


Communities also need to promote the jobs and skills that increase worker pay, create employment, and make their economy resilient to downturns. Moro, Frank, Pentland, Rutherford, Cebrian and Rahwan [5] have developed a skill connectivity measure for using community data to predict which skills will contribute most to the community’s labor market resilience. This skill connectivity measure is an ecologically-inspired employment matching process constructed from the similarity of every occupation’s skill requirements [6].

Looking at all of the cities within the US, they found that this skill connectivity measure predicted the economic resilience of cities to economic downturns. The reason skill connectivity is so important is simple: if workers can easily move from one type of job to another because the two jobs share similar skills, then they are less likely to remain unemployed for long.
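The intuition can be made concrete with a toy calculation. The sketch below is a simplified stand-in, not the measure defined in [5]: it scores the overlap between two occupations with Jaccard similarity of their skill sets and summarizes a city by the employment-weighted average overlap across pairs of occupations. The occupations, skills, employment counts, and function names are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of skills two occupations have in common."""
    return len(a & b) / len(a | b)

# Hypothetical skill requirements per occupation.
occupation_skills = {
    "nurse": {"patient care", "record keeping", "communication"},
    "medical assistant": {"patient care", "record keeping", "scheduling"},
    "office clerk": {"record keeping", "scheduling", "communication"},
    "welder": {"welding", "blueprint reading"},
}

def city_connectivity(employment):
    """Employment-weighted mean skill similarity over pairs of distinct occupations."""
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b in combinations(employment, 2):
        weight = employment[a] * employment[b]
        weighted_sum += weight * jaccard(occupation_skills[a], occupation_skills[b])
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0

# Hypothetical employment counts by occupation for two cities.
city_x = {"nurse": 100, "medical assistant": 80, "office clerk": 120}
city_y = {"nurse": 100, "welder": 200}

print(city_connectivity(city_x), city_connectivity(city_y))
```

In this toy setup the occupations in the first city share skills with one another, while the two in the second share none, so a displaced worker there has no adjacent occupation to move into.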

As illustrated in Figure 4, cities with greater skill connectivity experienced lower unemployment rates during the 2008 Great Recession and had increasing wage bills, and workers in occupations with a high degree of connectivity within a city’s job network enjoyed higher wages than their peers elsewhere. Skill connectivity, together with employment diversity, contributed the most toward lowering the unemployment rate during the recession.

Figure 4 More skills connectivity between jobs increases employment resilience [5].

Consequently, job training and economic development programs that promote skill overlap between the occupations within a community are likely to grow local labor markets and promote general economic resilience. Such job connectivity is also likely to be important in addressing technology-driven labor challenges, such as AI and robotic automation.

Building Social Capital

Central to any community is the trust and social capital within the community. Today many people have little trust in other members of their community, and this is the source of many problems including crime, poverty, and children’s developmental outcomes. Many lines of research show that one of the most reliable ways to create community trust and social capital is through cooperative community projects, and especially those that are community-owned (see Chapter 5), not just because such projects promote more communication and habits of cooperation within the community but also because they help give community members a sense of shared destiny and shared identity.

Extremely good measures of community trust and social capital can be derived from community data in a way that protects privacy, by looking at the frequency and diversity of within-community calls, messaging, and co-visiting (going to the same meetings, stores, parks, etc. at similar times) [7]. For instance, in [7] we found we could use this measure to very accurately predict the likelihood of giving help in time of sickness, willingness to loan money, and willingness to help with childcare.
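As a rough illustration of what "frequency and diversity" of interactions could mean in practice, the sketch below counts a member's interactions and computes the Shannon entropy of how they are spread across partners. This is an assumption made for illustration, not the measure used in [7]; the event list uses hypothetical pseudonymous identifiers and the function name is invented.

```python
from collections import Counter
from math import log

def interaction_profile(events, member):
    """Return (frequency, diversity) for one member.

    frequency: number of interaction events involving the member.
    diversity: Shannon entropy (in nats) of the distribution over partners.
    """
    partners = Counter(
        b if a == member else a
        for a, b in events
        if member in (a, b)
    )
    total = sum(partners.values())
    if total == 0:
        return 0, 0.0
    entropy = sum((c / total) * log(total / c) for c in partners.values())
    return total, entropy

# Hypothetical within-community events (calls, messages, co-visits),
# recorded with pseudonymous member identifiers only.
events = [
    ("m1", "m2"), ("m1", "m3"), ("m1", "m4"), ("m1", "m2"),
    ("m5", "m6"), ("m5", "m6"),
]

for member in ("m1", "m5"):
    print(member, interaction_profile(events, member))
```

Here the first member interacts with three different partners and the second with only one, so the first has the higher diversity score even though both are active. Only counts per pseudonymous partner are used, not message content.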

Communities that talk together and build together are resilient and, over the long term, more successful. Knowing about the levels of trust in a community allows community leaders to prioritize projects that build more, and more inclusive, trust and social capital. Access to community-level data is what can make this possible.


Today we are in a situation where individual data assets (people’s personal data) are being exploited without sufficient value being returned to the individual. This is analogous to the situation in the late 1800s and early 1900s that led to the creation of collective institutions such as credit unions and labor unions, and so the time seems ripe for the creation of collective institutions to represent the data rights of individuals.

We have argued that data cooperatives with fiduciary obligations to members provide a promising direction for the empowerment of individuals through collective use of their own personal data. Not only can a data cooperative give the individual expert, community-based advice on how to manage, curate and protect access to their personal data, it can also run internal analytics that benefit the collective membership. Such collective insights provide a powerful tool for negotiating better services and discounts for its members, and for guiding investments that improve the economic, health, and social conditions of the community and its members.


[6] M. R. Frank, et al. Toward understanding the impact of artificial intelligence on labor. Proceedings of the National Academy of Sciences, p. 201900949, 2019.

[7] N. Aharony, W. Pan, C. Ip, I. Khayal, and A. Pentland. Social fMRI: Investigating and shaping social mechanisms in the real world. Pervasive and Mobile Computing, 7(6):643–659, 2011.
