Research article

Key opportunities and challenges for the use of big data in migration research and policy

Authors
  • Lydia H. V. Franklinos (Institute for Global Health, University College London, London, UK)
  • Rebecca Parrish (Institute for Global Health, University College London, London, UK)
  • Rachel Burns (Centre of Public Health Data Science, Institute of Health Informatics, University College London, London, UK)
  • Andrea Caflisch (United Nations’ Displacement Tracking Matrix, International Organization for Migration, Juba, South Sudan)
  • Bishawjit Mallick (CU Population Center, Institute of Behavioral Science, University of Colorado Boulder Campus, Boulder, CO, USA)
  • Taifur Rahman (Health Management BD Foundation, Sector 6, Uttara, Dhaka, Bangladesh)
  • Vasileios Routsis (Department of Information Studies, University College London, London, UK)
  • Ana Sebastián López (GMV Innovating Solutions Ltd, HQ Building, Thomson Avenue, Harwell Campus, Didcot, UK)
  • Andrew J. Tatem (WorldPop, School of Geography and Environmental Science, University of Southampton, Southampton, UK)
  • Robert Trigwell (United Nations’ Displacement Tracking Matrix, International Organization for Migration, United Nations, London, UK)

Abstract

Migration is one of the defining issues of the 21st century. Better data is required to improve understanding about how and why people are moving, target interventions and support evidence-based migration policy. Big data, defined as large, complex data from diverse sources, is regularly proposed as a solution to help address current gaps in knowledge. The authors participated in a workshop held in London, UK, in July 2019, that brought together experts from the United Nations (UN), humanitarian non-governmental organisations (NGOs), policy and academia to develop a better understanding of how big data could be used for migration research and policy. We identified six key areas regarding the application of big data in migration research and policy: accessing and utilising data; integrating data sources and knowledge; understanding environmental drivers of migration; improving healthcare access for migrant populations; ethical and security concerns around the use of big data; and addressing political narratives. We advocate the need for careful consideration of the challenges faced by the use of big data, as well as increased cross-disciplinary collaborations to advance the use of big data in migration research whilst safeguarding vulnerable migrant communities.

Keywords: big data, migration, cross-disciplinary research, policy, humanitarian, environment, displacement, climate change, health, data security

How to Cite: Franklinos, L. H. V., Parrish, R., Burns, R., Caflisch, A., Mallick, B., Rahman, T., Routsis, V., López, A. S., Tatem, A. J., & Trigwell, R. (2021). Key opportunities and challenges for the use of big data in migration research and policy. UCL Open Environment, 3. https://doi.org/10.14324/111.444/ucloe.000027

Rights: © 2021 The Authors.

Published on
27 Oct 2021
Peer Reviewed

Introduction

With the number of global refugees reaching the highest levels since the Second World War [1] and one billion migrants recorded in 2018 alone [2], human migration is high on the global political agenda. The University College London (UCL)-Lancet Commission on Migration and Health [2] and the United Nations Global Compact on Migration [3] have called for improved data to understand drivers of migration, target interventions and support evidence-based migration policy. The application of big data in migration research and policymaking has been proposed as a possible solution to help address these knowledge gaps [2,4]. Big data refers to large, complex data from varied sources, ranging from social media and mobile phone data (Fig. 1) to electronic health records and satellite data, and has the potential to provide new sources of information for migration research [6,7]. Previous studies have used big data to predict patterns of human movement during natural disasters [8] and track movement in near real-time [9], quantify migration at national scales [10–14], guide and evaluate humanitarian interventions [15] and examine the effects of human movement on disease transmission [16]. In addition, satellite-based Earth observation data has been used to map the relationship between environmental change and human movement [17,18], model subnational migration flows [19] and inform policy decisions [18,20,21]. Despite the immense opportunity big data can provide for migration research and policy, several challenges have hindered its widespread implementation [2,4].

Figure 1

Key statistical indicators for global big data use. The number of users and portion of the population that has access (penetration) to the Internet, mobile phones, social media and mobile social media. Data were accessed via the Global Digital Report 2019 [5].

In response to the call for increased collaboration [2] and improved research on ways to utilise big data sources in the field of migration [4], we participated in a cross-disciplinary workshop in London, UK, on 3 July 2019. The workshop brought together UN representatives, humanitarian non-governmental organisations (NGOs), policymakers and academics to facilitate knowledge exchange and identify the key opportunities and challenges for the implementation of big data in migration research. Here, we summarise the key discussion points identified in the workshop via presentations, panel discussions and break-out groups in which participants explored different topics and possible solutions. We present the major conclusions from the workshop, supported by a review of the relevant literature, to assist migration experts in deciding whether the use of big data is appropriate for their work and to stimulate discussion about the potential of this approach in aiding migration research and policy, and in addressing the needs of migrant populations globally. In particular, the outcomes of this workshop may provide a timely resource for the recently launched Lancet Migration, a global collaboration of migration experts that aims to address evidence gaps and drive policy change in the field [22]. Importantly, this workshop also tested a range of methods of collaborative working and identified challenges and opportunities in achieving a truly cross-disciplinary approach to migration research, an important but often neglected aspect of a topic as complex and politically charged as migration.

Opportunities and challenges relating to big data

The workshop aimed to facilitate discussion of both the opportunities of big data and the challenges, shortfalls and current ethical considerations surrounding its use. Whilst it is beyond the scope of a single workshop or article to find solutions to these major challenges, the workshop proved a valuable forum for raising awareness of, and holding nuanced discussion around, challenges that are often not fully recognised or addressed by researchers and practitioners when collecting or using big data relating to migration.

The workshop consisted of keynote speeches and three panel discussions on 1) the role of big data in understanding the environmental drivers of migration, 2) the opportunities and challenges of big data in healthcare, and 3) the ethical considerations surrounding the use of big data in migration research and policy. The panel discussions were followed by group exercises which aimed to develop structured, actionable solutions to a set of questions raised during the panel discussions. The resulting discussions and ideas were grouped into six topics, which we summarise here and in Table 1.

Table 1

Key questions, challenges, opportunities and solutions for the use of big data in migration research and policy

Topic: Accessing and utilising big data

Key research and policy questions
  • How can we improve access to big data sources?
  • How can we enhance awareness of available data?
  • How can we develop the expertise required to use big data across disciplines?

Potential challenges
  • Issues of ownership and costs in accessing big data are significant barriers.
  • If multiple mobile network operators are operating in a country, multiple data sharing agreements would be required [23].
  • Geographical biases in the type of data available.
  • Poor awareness of fragmented data sources.
  • Sensitive information could potentially be extracted from big data, so access to such data arguably should be difficult.

Potential opportunities and solutions
  • Opportunities for the development of centralised repositories of data such as The Humanitarian Data Exchange [24] to promote collaboration and knowledge-sharing.
  • Opportunities for partnerships across different sectors to improve access to available data and technologies.
  • Capacity building projects across disciplines such as the UNECE Big Data Sandbox [25] are needed to support big data use for migration research.

Topic: Integrating data sources and knowledge

Key research and policy questions
  • How can we produce more detailed and recent migration statistics using big data?
  • How can we best integrate data from different sources?
  • How do we manage fragmented data sources across varied spatial and temporal scales?
  • How can we develop a collaborative cross-disciplinary approach to address the challenges in the field of migration?

Potential challenges
  • Complex analyses are required to account for multiple biases in different datasets.
  • Lack of awareness and big data expertise in the humanitarian sector has led to its slow adoption.

Potential opportunities and solutions
  • The use of big data sources combined with traditional survey-based research can reveal important aspects of migration that are often not captured (e.g., CDRs reveal short-term migration patterns) [26].
  • Potential for methodological innovation for the integration of different data and the development of a ‘gold standard’ to estimate migration using near-real-time big data.
  • Opportunities for cross-disciplinary ‘Data Collaboratives’ [27] to enable data exchange and help address complex problems.

Topic: Understanding environmental drivers of migration

Key research and policy questions
  • How can big data be used to assess the ongoing impact of climate change on migration?
  • Can big data help to identify populations that are vulnerable to environmental change?
  • How can big data be used to predict mass migration events due to environmental change?

Potential challenges
  • Satellite-based environmental data is often at coarse geographic scales which are not suitable to inform sub-national policies and actions.
  • There may be discrepancies between the assumed environmental drivers of migration perceived from satellite data analyses and the self-reported drivers of migration.

Potential opportunities and solutions
  • Opportunity to address the lack of evidence on the relationship between migration and environmental change and to help to define the term ‘environmental migrant’ that is needed for policy action.
  • Potential to reveal important aspects of migration associated with extreme weather events that are often not captured with traditional data.
  • Potential to unpick socioeconomic factors that may be limiting people’s ability to migrate as an adaptation strategy to environmental change.
  • Opportunities to inform policies that will improve resilience and support migration associated with environmental change.

Topic: Improving healthcare access for migrant populations

Key research and policy questions
  • How can big data be used to address the immediate health needs of displaced persons in camps?
  • How can big data help us learn more about undocumented migrants, their health and healthcare needs?
  • How can big data be used to implement evidence-based health interventions?

Potential challenges
  • Big data approaches in healthcare risk jeopardising the protection from state surveillance that is currently provided to vulnerable communities such as undocumented migrants.

Potential opportunities and solutions
  • Big data can help to understand the impact of migration on the transmission of infectious diseases.
  • Potential to support on-the-ground activities, helping to address the immediate health needs of displaced persons and to predict disease outbreaks.
  • Potential for big data to improve equitable healthcare access for underrepresented communities, such as undocumented migrants and vulnerable groups.
  • Opportunities to better address the specific healthcare needs of migrants with settled status.
  • Applications for big data in identifying differences in patient responses to treatments and tailoring healthcare to the specific needs of individuals.

Topic: Ethical, privacy and security concerns

Key research and policy questions
  • What is meant by ethics in the context of big data in migration research?
  • Who benefits from the use of big data (migrants at individual or community level, academic community, policymakers)?
  • How do power imbalances influence the use of big data?
  • How can we achieve ethical data usage?

Potential challenges
  • Big data analytics may lead to the introduction of stakeholders who have varied motivations.
  • Biases in big data may follow through into policies, propagating stereotypes and discriminatory practices, or else continuing to underserve invisible groups.
  • There is often a focus on the legal aspects of data protection for personal data rather than the potential negative impacts on affected vulnerable groups.
  • Big data use in migration can reinforce the power imbalance between those seeking data and those the data is being extracted from.
  • There is no legal enforcement of guidelines for safe and ethical data management in humanitarian situations.

Potential opportunities and solutions
  • Ethical considerations and safeguarding practices are required when involving varied stakeholders in big data analytics.
  • The use of big data in migrant research requires the clear stating of assumptions and methods to correct for biases to prevent the propagation of biases in policy.
  • Researchers and decision-makers must critically examine and justify why they require the use of big data and whether this is what all parties, particularly migrants, would want.

Topic: Addressing political narratives

Key research and policy questions
  • How can we prevent the use of big data for the discrimination of certain populations?
  • What role could big data have in addressing the negative political narratives around migration?

Potential challenges
  • Big data use may lead to lack of nuance behind the motivations for migration, potentially distorting the narrative behind migration patterns.
  • Ongoing anti-immigration rhetoric means it is imperative that big data is not used to further discriminate against migrant communities or to target certain populations.

Potential opportunities and solutions
  • Opportunity to help to address negative political narratives and to support inclusive and fair migration governance as seen with the Sentinel project, which works to counter the spread of misinformation and antimigrant rhetoric [28].
  • Designing strong, cross-disciplinary communication tactics to support maximum impact of evidence.
  • Opportunity to help to challenge public and media perceptions of global migration, which have been shaped by misinformation due to a paucity of migration data.

Accessing and utilising big data

The first topic focused on the access, awareness and expertise required for big data use. The application of big data is often hindered by the fact that many big data sources such as mobile phones, Internet-based platforms and other digital devices are managed by private companies who collect the data for business purposes. Therefore, costs associated with accessing big data and issues of ownership are significant barriers to its use [29]. Big data generation will vary geographically and may be reduced in many high mobility contexts where infrastructure (e.g., cell towers, Wi-Fi connection and electronic bank transfer services) is less established. Specific to the use of mobile phone call detail record (CDR) data, the presence of multiple mobile network operators in a country means that multiple data sharing agreements would be required [23]. In addition, there are significant issues around the potential extraction of sensitive information contained in big data [30,31], and data sources are often fragmented across disciplines, which reduces the awareness of available datasets [32]. Accounting for multiple biases and the complex analyses required to interpret the data are further examples of methodological difficulties associated with the use of big data [7,33].

Workshop discussions highlighted the importance of understanding how, why and when the data were collected to identify potential gaps and biases, thereby ensuring the data can be used effectively. There is a great need for more centralised repositories of data, projects and publications such as The Humanitarian Data Exchange [24], to promote knowledge-sharing and collaborations and to inform evidence-based programming. Increased partnerships between governments, international agencies, civil society and the private sector are also required to improve data access and ensure the optimum exploitation of available data and technologies. Furthermore, capacity building in countries or organisations with an interest in big data analysis is needed to support cross-disciplinary research and improve specialist knowledge in certain regions. This could be achieved via collaborations with relevant partners and agencies, as demonstrated by the United Nations Economic Commission for Europe’s (UNECE) Big Data Sandbox, which provided a platform for statistical organisations to collaborate and learn to use big data analytics [25]. However, there may be ethical considerations for private–public partnerships. For example, published commentaries have voiced fears over the partnership between the UN’s World Food Programme and the data analytics company Palantir, which may have serious consequences for the privacy and security of aid recipients due to the company’s links to United States (US) intelligence agencies [34].

Integrating data sources and knowledge

The second topic concerned the integration of data and knowledge across disciplines. Migration statistics are derived mainly from traditional methods such as household surveys recorded at local scales and national population estimates, as well as data on forced displacement collected through key informant networks [4] (Fig. 2). Big data sources have the potential to complement traditional data and address significant spatial and temporal gaps by updating migration statistics in an accurate and low-cost way [4,11]. For example, analysis of CDR data can be used to replicate national internal migration statistics and complement outputs from censuses [11]. However, integrating migration data from traditional methods with varied sources of big data requires new methodology that considers complex interactions over differing geographical and temporal scales. The quality of various datasets (e.g., demographic biases present in social media datasets) remains an unresolved challenge in teasing out comprehensive, policy-relevant results. Validating migration estimates derived from near-real-time big data is also problematic, with no trusted ‘gold standard’ currently available [37]. The slow adoption of big data analyses in the humanitarian sector is partly due to a lack of expertise in how to apply these approaches in operational settings [38]. Workshop participants discussed the need to bridge the gap between experts on the ground collecting the data via traditional methods and big data analysts via increased transdisciplinary training and collaborations. A recent workshop hosted by the International Organization for Migration (IOM) and the German Federal Foreign Office concluded that ‘greater cooperation and engagement among stakeholders’ both within and external to the migration sector are required to inform decision making [39].

If we are to integrate different data sources effectively, a collaborative cross-disciplinary approach is required to ensure we understand the data and how they can be used to deepen our understanding of the drivers and impacts of migration. This approach is practised in ‘Data Collaboratives’: collaborative projects in which different sectors including private companies, research institutions and government agencies collaborate to enable data exchange and help solve public problems [27]. NetHope is an example of a data collaborative project which has helped to integrate data sources and produce maps of connectivity sites across Puerto Rico to assist in delivering aid in the aftermath of Hurricane Maria [40].
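
To make the idea of deriving migration statistics from CDR data more concrete, the following is a minimal sketch in Python. It is not the method used in the studies cited above. It assumes a hypothetical, already-anonymised table of records (subscriber, month, region of the cell tower used), assigns each subscriber a "home" region per month as the region they were most often observed in, and counts subscribers whose home region changed between two months. All names and data are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical anonymised CDR rows: (subscriber_id, month, region_of_tower).
records = [
    ("a", "2019-01", "North"), ("a", "2019-01", "North"), ("a", "2019-01", "South"),
    ("a", "2019-02", "South"), ("a", "2019-02", "South"),
    ("b", "2019-01", "North"), ("b", "2019-02", "North"),
    ("c", "2019-01", "South"), ("c", "2019-02", "North"), ("c", "2019-02", "North"),
]


def modal_regions(rows):
    """Return {(subscriber, month): most frequently observed region}."""
    counts = defaultdict(Counter)
    for subscriber, month, region in rows:
        counts[(subscriber, month)][region] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def estimate_flows(rows, start, end):
    """Count subscribers whose modal region differs between two months."""
    home = modal_regions(rows)
    subscribers = {subscriber for subscriber, _, _ in rows}
    flows = Counter()
    for subscriber in subscribers:
        origin = home.get((subscriber, start))
        destination = home.get((subscriber, end))
        # Only count subscribers observed in both months who changed region.
        if origin and destination and origin != destination:
            flows[(origin, destination)] += 1
    return dict(flows)


migration_flows = estimate_flows(records, "2019-01", "2019-02")
print(migration_flows)
```

A sketch like this counts subscribers rather than people, so it inherits the coverage biases discussed in this section (phone ownership, operator market share, shared handsets) and would need validation against survey or census data before informing policy.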

Figure 2

Global human migration by subregion. A) Number of internally displaced persons (from conflicts and disasters) in 2020 ranging from 22,000 in Eastern Europe (dark blue) to 14,200,000 in Western Asia (yellow); data were accessed via the Internal Displacement Monitoring Centre’s Global Internal Displacement database [35]. B) Number of international migrants in 2019 ranging from 1,542,000 in the Caribbean (dark blue) to 58,647,000 in North America (yellow); data were accessed via the UN Department of Economic and Social Affairs, Population Division database [36]. The legends for both graphs show the number of migrants in thousands.

Understanding environmental drivers of migration

The third topic considered the use of big data in understanding and identifying environmental drivers of migration such as natural hazards and climate change [41], for example, via remote sensing. Currently, there is no internationally agreed definition for ‘environmental migrant’ despite it being required to collect long-term data and guide the policies of governments and international agencies. The IOM has proposed a broad working definition [42] which importantly considers that environmental migration might be triggered both by sudden-onset events [43], such as earthquakes and cyclones, and slower environmental change processes, such as desertification and sea-level rise. In the context of slow-onset events and gradual environmental change, the effects on migration are often difficult to quantify as they can be hidden behind more immediate socioeconomic drivers such as poverty or political processes [44]. There are opportunities for the application of big data to unpick these different drivers and to help to define the term ‘environmental migrant’. A further application could be in helping to quantify the effect of environmental change on the trapping of individuals or communities, usually due to rising poverty barriers which impede mobility. Human mobility can improve resilience and attenuate the negative outcomes of environmental degradation, but poverty, disability and social exclusion may limit people’s ability to resort to migration as an adaptation strategy [45].

Despite the significant attention that environment-induced human migration has received [46], evidence on the relationship between migration and environmental change is limited. When used in combination with traditional datasets, big data has the potential to identify hot spots of environmental change and exposed populations that may be affected by the change and therefore liable to migrate, or to become trapped. Satellite data is a particularly valuable resource in the analysis of environmental drivers as it enables the systematic, consistent and accurate monitoring of areas (even if remote or inaccessible) that are affected by anthropic or natural hazards. Indeed, satellite-based technologies are key to analysing observable effects of climate change and even predicting environment-led migration [47–52]. Furthermore, a study combined satellite data and CDR data to quantify the incidence, direction and duration of flood-driven migration, revealing important short-term (hours–weeks) aspects of migration associated with extreme weather events that are often not captured with traditional survey-based research [18]. There is great potential for big data to help in understanding and addressing environment-related displacement and to inform policies that will improve resilience to environmental change and support migration that is required to improve the health and livelihoods of vulnerable people. In such examples, improved data on both environmental change (as a driver of migration) and migration itself (such as displacement following a natural hazard) cannot ‘solve the problem’ of forced environmental migration but can inform interventions such as aid, as well as discussions between affected communities and stakeholders in devising context-appropriate solutions for the future.

One of the most valuable aspects of satellite-based analyses is the capability for retrospective analysis which is required to detect changing patterns across space and time and to inform predictions. However, a recent review stated that current initiatives do not exploit the full possibilities of satellite-based Earth observation in migration, with a lack of services offering the systematic flow of detailed information to researchers, managers and migration analysts [53]. One of the main gaps identified is that consolidated satellite-based monitoring systems currently work at regional scales, providing a geographic resolution that is often too coarse to understand the specificities of how particular communities are affected. These datasets are thus unable to reliably inform the design, implementation and monitoring of sub-national policies. Indeed, reported discrepancies between the assumed environmental drivers of migration perceived from satellite data analyses and the self-reported drivers of migration (e.g., see [54]) underline the importance of including information on the lived experiences of migrants to inform actions.

Improving healthcare access for migrant populations

The fourth area of discussion focused on the potential for big data to enhance migrant health via improved disease outbreak preparedness, identification of vulnerable groups, increased access to healthcare and by informing evidence-based health interventions. Many studies have demonstrated the potential for big data to help understand the impact of migration on the transmission of infectious diseases such as dengue [16], malaria [19,55] and cholera [56] at a national scale. These analyses highlight the potential benefit of big data use to improve preparedness and mitigation efforts for disease outbreaks. This may be particularly useful when supporting on-the-ground activities by helping to predict potential disease outbreaks for displaced persons. A recent collaboration between the IOM and the mobile operator data analytics organisation Flowminder is combining IOM Flow Monitoring Registry surveys with CDR data to gather anonymous information about people on the move at key transit points to inform public health interventions for the coronavirus (COVID-19) pandemic [26]. A further application of big data in humanitarian settings is the use of satellite data to map refugee settlements [21], which can be used to ensure healthcare access for displaced persons.

A perennial issue in equitable healthcare access is identifying and addressing the needs of invisible communities, such as migrants, particularly those with an undocumented status [2]. Shortfalls with traditional datasets and data collection processes such as semi-structured interviews and surveys mean there is limited information on undocumented migrants and vulnerable groups (e.g., unaccompanied children, people with disabilities and members of the lesbian, gay, bisexual, transgender and intersex (LGBTI) community). There is great potential for big data to address the paucity of information on these groups, their health, access to healthcare and differing healthcare needs. In addition, it was suggested that the healthcare needs of migrants settled in countries such as the UK could also be better addressed through big data analysis, for example, via a general migrant longitudinal study such as the cohort studies performed by the UK Economic and Social Research Council [57]. There are also vast applications for big data in implementing evidence-based health interventions that need to be explored, specifically in identifying differences in patient responses to treatments and tailoring healthcare to the specific needs of individuals [58].

The COVID-19 pandemic has prompted widespread support for the use of big data in disease surveillance systems globally [59]. Such approaches have had various levels of success at curbing infection rates, which is particularly crucial for vulnerable persons (who could include some migrant communities). However, if the use of such big data approaches in healthcare became commonplace, it may cost many the protection that invisibility currently offers, with many communities such as undocumented migrants fearing disproportionate effects of state surveillance [60]. Such ethical conundrums remain unresolved and often underexplored. At best, big data should be seen not as a perfect solution but as one optional tool within a wider social toolkit which retains traditional and non-digital interventions.

Ethical, privacy and security concerns

The fifth topic focused on ethical, privacy and security concerns regarding the use of big data in migration research. This topic proved to be cross-cutting, with themes re-emerging across other topics. Principally, a recurrent question arose about whether it is appropriate to collect and/or use sensitive data in the pursuit of greater understanding for researchers and policymakers. The collection of personal data including migrant status is a contentious issue. There are concerns that information on personal migration status may create or exacerbate discriminatory practices in areas such as the provision of healthcare and access to state funds, or that mobile tracking devices may be used against a migrant to forcibly return them to a previous location [2]. Additionally, various sources of big data (such as social media) are consumed differently based on geography, demography and access. As such, any analysis of these datasets will carry these biases, which may follow through into policies. This may result in policies which propagate stereotypes and discriminatory practices, or else continue to underserve invisible groups (e.g., those not engaged with social media or with smaller social networks) [37].
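
The way demographic bias in a data source can carry through into an estimate can be illustrated with a small worked example. The sketch below is an assumption-laden illustration, not a method from the workshop or the cited literature: the age groups, population shares and sample counts are invented, and post-stratification is named here as one standard reweighting technique among several. It compares a naive rate computed directly from a platform sample that over-represents younger users with a rate reweighted to known population shares.

```python
# Hypothetical population shares by age group (e.g. taken from a census).
population_share = {"18-29": 0.30, "30-49": 0.40, "50+": 0.30}

# Hypothetical platform sample: group -> (users observed, users signalling a move).
# Younger users are heavily over-represented relative to the population.
sample = {
    "18-29": (600, 120),
    "30-49": (300, 30),
    "50+": (100, 5),
}


def naive_rate(sample):
    """Share of observed users signalling a move, ignoring who is in the sample."""
    users = sum(n for n, _ in sample.values())
    movers = sum(m for _, m in sample.values())
    return movers / users


def post_stratified_rate(sample, population_share):
    """Weight each group's observed rate by its share of the real population."""
    return sum(population_share[g] * m / n for g, (n, m) in sample.items())


naive = naive_rate(sample)
weighted = post_stratified_rate(sample, population_share)
print(f"naive: {naive:.3f}, post-stratified: {weighted:.3f}")
```

Reweighting of this kind can only correct for groups that appear in the data at all. People who are entirely absent from a platform, such as the invisible groups described above, remain unrepresented however the sample is weighted.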

Ethics in the context of big data in migration may be considered in several ways. Firstly, it may relate to the way in which the research is conducted and whether there has been consideration for data privacy and security. Secondly, it could refer to decision-making regarding migrants with consideration for their lived experience, especially in humanitarian situations. A recent report on migration noted that discussions on ethics often focus on the legal aspects of data protection rather than understanding how the results of analyses may detrimentally impact affected populations and counter the humanitarian principles to ‘do no harm’ [39]. Furthermore, data protection measures are often focused on personal data (e.g., General Data Protection Regulation in the European Union) and do not consider group data protection needed to work with vulnerable groups [61].

It is important to consider who benefits from the use of big data sources in migration research. At the individual level, migrants may not wish for additional data to be gathered about them and may perceive no benefits of the process [62]. However, at the community level, such data and analysis may help to address the health needs of migrants more generally. Certainly, there will be benefits to the academic community seeking to study migrant health needs and to decision-makers seeking evidence-based solutions. Pursuit of these research and policy goals can result in individual rights being overlooked and raise ethical issues for many vulnerable people [62]. Furthermore, forcibly displaced people fleeing persecution may have little trust in authorities and therefore be less willing to seek healthcare or consent to having their data collected. This creates a barrier for healthcare professionals, humanitarian workers and researchers who wish to respect the rights of the individual, whilst deriving a better understanding of migration pathways and healthcare needs. Workshop discussions highlighted the power imbalance between the various parties involved: those seeking data, including governments and academics, often from the global North with inherent biases and power, and those the data is being sought from, who are often vulnerable persons in precarious or dangerous situations, many originating from the global South [63]. Evidence of this power imbalance is seen in ‘high-risk experiments’ with new technologies, such as the use of Canada’s automated decision-making technology in immigration and refugee applications [64], which often lack regulation. Even after applying advanced safeguarding practices and aggregating outputs, researchers may still be reluctant to apply big data analysis for migration research, as policymakers often have their own agendas and may use the methods and deliverables in ways not intended or anticipated by the research authors.
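
As one illustration of what a safeguarding practice applied to aggregated outputs might look like, the sketch below suppresses small counts before release. This is a generic disclosure-control technique, not a practice described by the workshop participants. The threshold `k`, the region names and the counts are hypothetical.

```python
def suppress_small_counts(flows, k=10):
    """Replace any count below k with None so small groups are not singled out."""
    return {pair: (count if count >= k else None) for pair, count in flows.items()}


# Hypothetical aggregated origin-destination counts.
aggregated = {
    ("Region A", "Region B"): 340,
    ("Region A", "Region C"): 4,   # small enough to risk identifying individuals
    ("Region B", "Region C"): 10,
}

released = suppress_small_counts(aggregated, k=10)
print(released)
```

Suppression of this kind reduces the risk of singling out individuals, but it does not by itself address the group-level harms or the downstream misuse of outputs discussed in this section.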

Another established concern regarding the ethical deployment of big data in migration research is the focus on pattern-based analyses, rather than on achieving a conceptual interpretation and critical analysis of why behavioural trends might emerge and under what set of circumstances and assumptions [65,66]. Indeed, the limitations and biases of big data remain an under-explored aspect of the discourse around the ethical implications of big data. The COVID-19 pandemic provides a pertinent example of how the lack of data from vulnerable communities such as refugees or people on the move has led to the underrepresentation of these communities in the narrative of and political responses to the pandemic [67].

Addressing political narratives

The final topic concerned the role of big data in high-level political narratives around migration. With ongoing antimigrant rhetoric existing at all levels of government and society, migrant research has the opportunity and mandate to address such political narratives. There are many examples of authorities treating migrants as political pawns or as statistical figures [68]. Therefore, it is imperative that big data is not used to further discriminate against migrant communities or to target certain populations, but rather to support inclusive and fair migration governance. This risk is particularly acute with big datasets, where assumptions have been made throughout the data collection and analysis process. For example, in 2017 the UK data analytics company CGI, together with the Dutch statistical agency CBS, conducted a study into forecasting the migration of people in Syria using Twitter data [66]. The study required a series of assumptions to be made about Twitter content (more specifically its English translation) and was unable to consider the context within which tweets were posted. This case study demonstrates how the promotion of big data can reify such assumptions and distort the original meaning or intent behind an individual’s migration decision, as well as pervert the aggregate narrative behind migration patterns [66]. Conversely, a positive example discussed within the workshop was the Sentinel Project, an NGO that works to gather and disseminate ‘trusted’ information to local people and governments in order to counter the spread of misinformation and antimigrant rhetoric and to prevent resultant hate crime and genocide [28]. Participants also deliberated whether the increased evidence provided by big data would be instructive and powerful enough to overcome political and social biases associated with the topic.
Given the highly political nature and high stakes of migration policy for migrants as well as for governments and the international community, more evidence alone may be insufficient to achieve multilateral, progressive action. Therefore, it is worth considering other factors contributing to the political discourse and designing strong, cross-disciplinary communication tactics to support maximum impact of evidence. Furthermore, it is worth considering how the paucity of migration data has helped to shape public and media perceptions of global migration patterns to date, and whether big data could be used to address these perceptions.

Discussion

The application of big data in migrant research shows much promise in addressing the current gaps in knowledge. Big data sources can help to update internal migration statistics by addressing the significant gaps in quantity and quality of data collected from traditional methods [4,11]. When combined with field-level data derived from household surveys and key-informant networks, big data can be used to detect how sudden-onset natural hazards and gradual environmental change (e.g., desertification and sea-level rise associated with climate change) impact migration patterns. This can help to inform planning and scenario building, as well as contributing to a more comprehensive definition of ‘environmental migrant’, which is critical for migration policy within the context of ongoing environmental change. In addition, it has a potential application in considering the differing healthcare needs of migrants as well as identifying vulnerable populations unable to migrate due to environmental change. There is also scope for big data to inform evidence-based health interventions for migrant populations in everyday and emergency (including displacement) settings. Yet despite the vast opportunities that big data presents, there are some important areas to consider before using these varied and complex data sources. Increased cross-disciplinary partnerships are required to improve data access, knowledge-sharing and capacity building across sectors and regions. In addition, a collaborative cross-disciplinary approach is required to ensure the different datasets are understood and to develop new methodologies to integrate data sources and identify complex interactions that influence how and why people are moving. Furthermore, the reported lack of agreement within the humanitarian sector on how migration modelling should be applied needs to be addressed so analyses can be effective [39].
It is important to challenge the assumption that big data is always a suitable and insightful tool to use in research for migration policy. Whilst it is tempting to consider big data a magic bullet, it may not always be appropriate or may need to be used in conjunction with traditional methods.

International legislation is required to sufficiently address how migrant data should be collected and used to ensure ethical conduct by data gatherers and owners and the safeguarding of human rights, particularly in sensitive migration contexts [69]. The United Nations Development Group [69] and Office for the Coordination of Humanitarian Affairs [70] provide guidelines for safe and ethical data management in humanitarian situations; however, there is no legal enforcement of these practices. Although researchers would like quicker and easier access to data, workshop discussions challenged whether the process should be hastened, suggesting that administrative obstacles force researchers to duly consider whether additional data is necessary and beneficial to the current state of knowledge, given the risks and trade-offs that must be made. A key output of the workshop was a consensus that researchers and decision-makers must first ask why they require additional data and whether this is what all parties, particularly migrants, would want. As well as data protection issues, it is also imperative to understand the potential harmful impact of analyses on vulnerable migrant populations [39]. It is especially important that big data is not used to further discriminate against or target migrant populations considering current antimigrant political narratives.

In pursuit of cross-disciplinary collaboration, the workshop brought together representatives of the UN, government and humanitarian organisations, and academics from a range of backgrounds. Cross-sector engagement in the workshop was difficult to achieve, which may be due to differences in the objectives of different sectors, as well as the language of engagement used. For instance, humanitarian organisations were particularly difficult to engage, and it is thought that this was due to both a shortage of networks linking academia and humanitarian organisations and differences in the short-, medium- and long-term needs and objectives of the two sectors. We trialled different methods to stimulate interdisciplinary work including the use of business canvases [71] to explore and present solutions to questions. This approach was useful for stimulating debate within the groups and producing well-considered outputs. However, future interdisciplinary events would benefit from the development of methods that consider the language styles and information sharing techniques of different disciplines and thus facilitate effective communication and knowledge-sharing [4,39,72,73]. The workshop also elucidated the extent to which the direction of conversation is steered by which actors are engaging in the conversation. For example, actors engaging with migrants directly (i.e., service providers) focus on practical implications. This serves as a pertinent reminder of the importance of academics creating and utilising diverse networks as well as the need for actors with gatekeeping power to exercise due diligence and engage a wide range of stakeholders and interest groups in discussions. Overall, the workshop highlighted the benefits of cross-disciplinary work, enabling the identification of key topics from a variety of angles and providing meaningful and effective outputs.
Furthermore, we hope this workshop assists in cultivating a future transdisciplinary approach to migration research, whereby there is a move beyond the collaboration of individual disciplinary perspectives to develop curriculum integration that organises knowledge production in the context of real-world problems [74].

Declarations and conflicts of interest

Research ethics statement

Not applicable.

Consent for publication statement

The authors declare that research participants’ informed consent to publication of findings – including photos, videos and any personal or identifiable information – was secured prior to publication.

Conflicts of interest statement

The authors declare no conflict of interest with this work.

Open data and materials availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Funding and acknowledgements

The workshop was supported by a UCL Grand Challenges grant (506002-100-156425). We thank Leroy Clement, Ilan Kelman, Bayes Ahmed, Chris Langridge, Yasna Palmeiro-Silva, Laura Busert, Maheswar Satpathy, Emeline Rougeaux and Alicia Matthews for their help in the organisation of the workshop.

Author contributions

LF and RP led the workshop and took the lead role in writing the report. All authors participated in discussion sessions at the workshop and contributed to the report. Authors 3–10 are listed alphabetically.

References

[1]  UNHCR. Global Trends: forced displacements in 2015, Geneva, Switzerland:

[2]  Abubakar, I; Aldridge, RW; Devakumar, D; Orcutt, M; Burns, R; Barreto, ML. (2018).  The UCL–Lancet Commission on Migration and Health: the health of a world on the move.  Lancet 392 (10164) : 2606–54.

[3]  UN. Global compact for safe, orderly and regular migration,

[4]  IOM. Data bulletin-big data and migration. Ispra: International Organization for Migration. Available from: https://publications.iom.int/system/files/pdf/issue_5_big_data_and_migration.pdf . Accessed 20 March 2020

[5]  We Are Social, Hootsuite. Global digital report 2019, New York: Available from: https://wearesocial.com/global-digital-report-2019 . Accessed 1 April 2020

[6]  Fleming, L; Depledge, M; Leonelli, S; Gordon-Brown, H; Leonardi, G; Golding, B. (2017).  Big data in environment and human health.  Oxf Res Encycl Environ Sci July : 1–27. Available from: https://oxfordre.com/environmentalscience/view/10.1093/acrefore/9780199389414.001.0001/acrefore-9780199389414-e-541.

[7]  Hay, SI; George, DB; Moyes, CL; Brownstein, JS. (2013).  Big data opportunities for global infectious disease surveillance.  PLoS Med 10 (4) : 2–5.

[8]  Lu, X; Bengtsson, L; Holme, P. (2012).  Predictability of population displacement after the 2010 Haiti earthquake.  Proc Natl Acad Sci 109 (29) : 11576-81.

[9]  Wilson, R; Zu Erbach-Schoenberg, E; Albert, M; Power, D; Tudge, S; Gonzalez, M. (2016).  Rapid and near real-time assessments of population displacement using mobile phone data following disasters: the 2015 Nepal earthquake.  PLOS Curr Dis 8 ecurrents.dis.d073fbece328e4c39087bc086d694b5c.

[10]  Lu, X; Wrathall, DJ; Sundsøy, PR; Nadiruzzaman, M; Wetter, E; Iqbal, A. (2016).  Detecting climate adaptation with mobile network data in Bangladesh: anomalies in communication, mobility and consumption patterns during cyclone Mahasen.  Clim Change 138 : 505–19.

[11]  Lai, S; Zu Erbach-Schoenberg, E; Pezzulo, C; Ruktanonchai, NW; Sorichetta, A; Steele, J. (2019).  Exploring the use of mobile phone data for national migration statistics.  Palgrave Commun 5 (1) : 1–10.

[12]  Ruktanonchai, NW; Ruktanonchai, CW; Floyd, JR; Tatem, AJ. (2018).  Using Google Location History data to quantify fine-scale human mobility.  Int J Health Geogr 17 (28) : 1–13.

[13]  Spyratos, S; Vespe, M; Natale, F; Weber, I; Zagheni, E; Rango, M. (2019).  Quantifying international human mobility patterns using Facebook Network data.  PLoS One 14 (10) e0224134

[14]  Palotti, J; Adler, N; Morales-Guzman, A; Villaveces, J; Sekara, V; Herranz, MG. (2020).  Monitoring of the Venezuelan exodus through Facebook’s advertising platform.  PLoS One 15 (2) e0229175

[15]  Peak, CM; Wesolowski, A; Zu Erbach-Schoenberg, E; Tatem, AJ; Wetter, E; Lu, X. (2018).  Population mobility reductions associated with travel restrictions during the Ebola epidemic in Sierra Leone: use of mobile phone data.  Int J Epidemiol Oct 1 2018 47 (5) : 1562–70.

[16]  Wesolowski, A; Qureshi, T; Boni, MF; Sundsøy, PR; Johansson, MA; Basit, S. (2015).  Impact of human mobility on the emergence of dengue epidemics in Pakistan.  PNAS 112 (38) : 11887–92.

[17]  van der Geest, K; Vrieling, A; Dietz, T. (2010).  Migration and environment in Ghana: a cross-district analysis of human mobility and vegetation dynamics.  Environ Urban 22 (1) : 107–23.

[18]  Lu, X; Wrathall, DJ; Sundsøy, PR; Wetter, E; Iqbal, A; Qureshi, T. (2016).  Unveiling hidden migration and mobility patterns in climate stressed regions: a longitudinal study of six million anonymous mobile phone users in Bangladesh.  Glob Environ Change 38 : 1–7.

[19]  Sorichetta, A; Bird, TJ; Ruktanonchai, NW; Zu Erbach-Schoenberg, E; Pezzulo, C; Tejedor, N. (2016).  Mapping internal connectivity through human migration in malaria endemic countries.  Sci Data Aug 16 2016 3 (1) : 1–16.

[20]  Hauer, ME; Fussell, E; Mueller, V; Burkett, M; Call, M; Abel, K. (2020).  Sea-level rise and human migration.  Nat Rev Earth Environ Jan 2020 1 (1) : 28–39.

[21]  Logar, T; Bullock, J; Nemni, E; Bromley, L; Quinn, JA; Luengo-Oroz, M. (2020).  PulseSatellite: a tool using human-AI feedback loops for satellite image analysis in humanitarian contexts.  Proc AAAI Conf Artificial Intell 34 (09) : 13628–9.

[22]  Orcutt, M; Spiegel, P; Kumar, B; Abubakar, I; Clark, J; Horton, R. (2020).  Lancet migration: global collaboration to advance migration health.  Lancet Feb 1 2020 395 (10221) : 317–9.

[23]  Gerstle, T; Hardimon, M; Hunt, G; Main, I; Paudel, L; Pontón, D. (2021).  Assessing the use of Call Detail Records (CDR) for monitoring mobility and displacement. Princeton, NJ: Princeton University School of Public and International Affairs. Feb 2021 Available from: https://www.migrationdataportal.org/sites/default/files/2021-02/IOM_Princeton_CDRReport_Feb2021_web.pdf . Accessed 9 September 2021

[24]  Welcome – Humanitarian Data Exchange. Available from: https://data.humdata.org/ . Accessed 18 June 2020

[25]  Vale, S. (2016).  Big Data Sandbox. UNECE. Available from: http://www1.unece.org/stat/platform/display/bigdata/Sandbox . Accessed 19 March 2020

[26]  Harrison, J; Harper, M; Gray, J; Lefebvre, V. (2020).  Privacy-Conscious Data Analytics to Support the COVID-19 Response in Haiti. Report 2. Jun 2020 Stockholm, Sweden: Flowminder Foundation. Available from: https://haiti.iom.int/sites/haiti/files/documents_files/Haiti_Digicel_DTM_Slide_Report_June2020_eng.pdf . Accessed 9 September 2021

[27]  Winowatan, M. (2018).  The emergence of data collaboratives in numbers.  The Governance Lab, Available from: http://thegovlab.org/the-emergence-of-data-collaboratives-in-numbers/ . Accessed 20 March 2020

[28]  The Sentinel Project. The Sentinel Project, Available from: https://thesentinelproject.org/ . Accessed 20 March 2020

[29]  Kirkpatrick, R; Vacarelu, F. (2018).  A decade of leveraging big data for sustainable development.  UN Chronicle 3 Dec 2018 Available from: https://unchronicle.un.org/article/decade-leveraging-big-data-sustainable-development . Accessed 11 February 2020

[30]  von Mörner, M. (2017).  Application of call detail records – chances and obstacles.  Transp Res Procedia 25 : 2233–41.

[31]  de Montjoye, Y-A; Gambs, S; Blondel, V; Canright, G; de Cordes, N; Deletaille, S. (2018).  On the privacy-conscientious use of mobile phone data.  Sci Data Dec 11 2018 5 (1) : 1–6.

[32]  Verhulst, SG; Young, A. (2018).  The potential and practice of data collaboratives for migration.  Stanford Social Innovation Review, Available from: https://ssir.org/articles/entry/the_potential_and_practice_of_data_collaboratives_for_migration . Accessed 19 March 2020

[33]  Wesolowski, A; Eagle, N; Noor, AM; Snow, RW; Buckee, CO. (2013).  The impact of biases in mobile phone ownership on estimates of human mobility.  J R Soc Interface Apr 6 2013 10 (81) 20120986.

[34]  Parker, B. (2019).  New UN deal with data mining firm Palantir raises protection concerns.  The New Humanitarian, Available from: https://www.thenewhumanitarian.org/news/2019/02/05/un-palantir-deal-data-mining-protection-concerns-wfp . Accessed 19 March 2020

[35]  Internal Displacement Monitoring Centre (IDMC). Global Internal Displacement Database Data, Available from: https://www.internal-displacement.org/database/displacement-data . Accessed 25 August 2020

[36]  United Nations, Department of Economic and Social Affairs, Population Division. World population prospects 2019, Online Edition. Rev. 1.

[37]  Sîrbu, A; Andrienko, G; Andrienko, N; Boldrini, C; Conti, M; Giannotti, F. (2021).  Human migration: the big data perspective.  Int J Data Sci Anal 11 : 341–60. DOI: http://dx.doi.org/10.1007/s41060-020-00213-5

[38]  Oroz, ML. (2018).  From big data to humanitarian-in-the-loop algorithms.  UNHCR Innovation, Available from: https://www.unhcr.org/innovation/big-data-humanitarian-loop-algorithms/ . Accessed 11 February 2020

[39]  IOM. Workshop report on forecasting human mobility in contexts of crises. Berlin: German Federal Foreign Office (FFO) and the International Organization for Migration (IOM). Oct 2019

[40]  NetHope Blog. Unlocking insights from data: collaboration with private sector creates cutting-edge maps for disaster response – NetHope, Available from: https://nethope.org/2018/09/10/unlocking-insights-from-data-collaboration-with-private-sector-creates-cutting-edge-maps-for-disaster-response/ . Accessed 20 Mar 2020

[41]  Martin, S; Ferris, E; Kumari, K; Bergmann, J. (2018).  The global compacts and environmental drivers of migration, KNOMAD. Report No.: Policy Brief 11.

[42]  IOM. International Migration Law: glossary on migration. Geneva: International Organization for Migration. Available from: https://www.iom.int/glossary-migration-2019 . Accessed 30 March 2020

[43]  Disaster Displacement. Key Definitions. Platform on Disaster Displacement, Available from: https://disasterdisplacement.org/the-platform/key-definitions . Accessed 30 March 2020

[44]  Ionesco, D; Mokhnacheva, D; Gemenne, F. (2017).  The atlas of environmental migration. 1st ed Oxon, UK: Routledge.

[45]  Oakes, R; Banerjee, S; Warner, K. (2019). Chapter 9: Human mobility and adaptation to environmental change In:  World Migration Report 2020. Geneva: United Nations University Institute for Environment and Human Security, International Organization for Migration and UNFCCC Secretariat.

[46]  Maystadt, J; Mueller, V. (2012).  Environmental migrants: a myth? International Food Policy Research Institute. Research Brief 18. Available from: https://www.ifpri.org/publication/environmental-migrants . Accessed 20 August 2021

[47]  Chen, J; Mueller, V. (2018).  Coastal climate change, soil salinity and human migration in Bangladesh.  Nature Clim Change 8 : 981–5.

[48]  Chen, J; Mueller, V; Jia, Y; Kuo-Hsin Tseng, S. (2017).  Validating migration responses to flooding using satellite and vital registration data.  Amer Econ Rev 107 (5) : 441–5.

[49]  Liu, Zhen; Balk, D. (2020).  Urbanisation and differential vulnerability to coastal flooding among migrants and nonmigrants in Bangladesh.  Popul Space Place 26 e2334

[50]  Mueller, V; Gray, C; Kosec, K. (2014 Mar).  Heat stress increases long-term human migration in rural Pakistan.  Nat Clim Change 4 (3) : 182–5.

[51]  Davis, KF; Bhattachan, A; D’Odorico, P; Suweis, S. (2018).  A universal model for predicting human migration under climate change: examining future sea level rise in Bangladesh.  Environ Res Lett Jun 2018 13 (6) : 064030.

[52]  DTM. South Sudan: seasonal floods analysis. International Organization for Migration, Oct 2019 Available from: https://displacement.iom.int/reports/south-sudan-%E2%80%94-seasonal-flooding-maps-november-2019 . Accessed 30 March 2020

[53]  Lang, S; Füreder, P; Riedler, B; Wendt, L; Braun, A; Tiede, D. (2019).  Earth observation tools and services to increase the effectiveness of humanitarian assistance.  Eur J Remote Sens Oct 30 2019 0 (0) : 1–19.

[54]  Hoffmann, EM; Konerding, V; Nautiyal, S; Buerkert, A. (2019).  Is the push-pull paradigm useful to explain rural–urban migration? A case study in Uttarakhand, India.  PLoS One 14 (4) e0214511

[55]  Wesolowski, A; Eagle, N; Tatem, AJ; Smith, DL; Noor, AM; Snow, RW. (2012).  Quantifying the impact of human mobility on malaria.  Science 338 (6104) : 267–70.

[56]  Bengtsson, L; Gaudart, J; Lu, X; Moore, S; Wetter, E; Sallah, K. (2015).  Using mobile phone data to predict the spatial spread of cholera.  Sci Rep 5 : 8923.

[57]  ESRC. Centre for Longitudinal Studies. UKRI – Economic and Social Research Council, Available from: https://esrc.ukri.org/research/our-research/centre-for-longitudinal-studies/ . Accessed 17 March 2020

[58]  Pastorino, R; De Vito, C; Migliara, G; Glocker, K; Binenbaum, I; Ricciardi, W. (2019).  Benefits and challenges of Big Data in healthcare: an overview of the European initiatives.  Eur J Public Health 29 (Suppl_3) : 23–7.

[59]  Zwitter, A; Gstrein, OJ. (2020).  Big data, privacy and COVID-19 – learning from humanitarian expertise in data protection.  Int J Humanitarian Action 5 (1) : 4.

[60]  Pelizza, A; Lausberg, Y; Milan, S. (2020).  The dilemma of making migrants visible to COVID-19 counting.  Processing Citizenship Blog, April 28 2020 Available from: https://processingcitizenship.eu/the-dilemma-of-making-migrants-visible-to-covid-19-counting/ . Accessed 20 August 2020

[61]  World Economic Forum. Civil society in the fourth industrial revolution: preparation and response, Cologny/Geneva:

[62]  Lamber, R; Pinter, K; Aigner, A; Reiterer, M; Kappel, K; Grechenig, T. (2019). Ethical issues arising through identification and registration systems applied in a European refugee camp In:  2019 9th International Conference on Advanced Computer Information Technologies (ACIT). : 320–4.

[63]  Nawyn, SJ. (2016).  Migration in the Global South: exploring new theoretical territory.  Int J Sociol 46 (2) : 81–4.

[64]  Molnar, P. (2020). AI and migration management In:  The Oxford handbook of ethics of AI, Dubber, MD; Pasquale, F; Das, S (eds.). Oxford Handbooks Online. DOI: http://dx.doi.org/10.1093/oxfordhb/9780190067397.013.49

[65]  Halford, S; Savage, M. (2017).  Speaking sociologically with big data: symphonic social science and the future for big data research.  Sociology 51 (6) : 1132–48.

[66]  Taylor, L; Meissner, F. (2020).  A crisis of opportunity: market-making, big data, and the consolidation of migration as risk.  Antipode 52 (1) : 270–90.

[67]  Milan, S; Treré, E. (2020).  The rise of the data poor: the COVID-19 pandemic seen from the margins.  Soc Media Soc Aug 11 2020 6 (3) 2056305120948233.

[68]  Arcimaviciene, L; Bağlama, SH. (2018).  Migration, metaphor and myth in media representations: the ideological dichotomy of ‘them’ and ‘us’.  SAGE Open 8 : 1–13.

[69]  UNDG. Data privacy, ethics and protection: guidance note on big data for achievement of the 2030 Agenda. United Nations Development Group.

[70]  The Centre for Humanitarian Data. Data responsibility guidelines,

[71]  Osterwalder, A; Pigneur, Y. (2009).  Business model generation. Amsterdam: Self Published.

[72]  Gooch, D; Vasalou, A; Benton, L. (2017).  Impact in interdisciplinary and cross-sector research: Opportunities and challenges.  J Assoc Inf Sci Technol 68 (2) : 378–91.

[73]  Vogel, KM; Tyler, BB. (2019).  Interdisciplinary, cross-sector collaboration in the US Intelligence Community: lessons learned from past and present efforts.  Intell Natl Secur 34 (6) : 851–80.

[74]  Choi, BCK; Pak, AWP. (2006).  Multidisciplinarity, interdisciplinarity and transdisciplinarity in health research, services, education and policy: 1. Definitions, objectives, and evidence of effectiveness.  Clin Investig Med Med Clin Exp 29 (6) : 351–64.

 Open peer review from Frances Darlington-Pollock

Review

Review information

DOI: 10.14293/S2199-1006.1.SOR-COMPSCI.A4NSIB.v1.RSQENK
License:
This work has been published open access under Creative Commons Attribution License CC BY 4.0 , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com .

Keywords: migration , humanitarian , Public policymaking , Environmental modelling , Statistics , cross-disciplinary research , health , climate change , environment , data security , Environmental justice and inequality/inequity , displacement , policy , Health and climate change , Big data

Review text

It is not new to question how we can better understand who moves where and why. Yet the synthesised debates drawn from a cross-disciplinary workshop on the implementation of big data in migration research do act as a platform for new work in this area. The explosion of big data and wider developments in computer science mean that, increasingly, policy makers, researchers and businesses are redirecting research and action towards the opportunities afforded by such data. While it is easy to be swayed by the promise of the sort of temporal and spatial granularity big data can offer, in practice there are significant challenges. This paper clearly summarises these challenges, simultaneously touching on the opportunities that may arise if the challenges are addressed. The paper’s impact might be strengthened by more depth on the latter, though this by no means undermines its value: it is a good discussion of how to create and develop future research agendas on migration with big data.

In more detail, Table 1 usefully summarises the topics explored, giving hints to the relevant questions within. But a review of this table in relation to the discussion offered does leave space for more. Though the authors do set out to highlight opportunities for big data and migration research, the discussion would benefit from more concrete examples of how. In particular, this would strengthen the paper if seeking to sway those in the research community who may be less receptive to the potential of big data. The section on healthcare is one example where this could be addressed given its brevity: having finished that section I was not sure precisely how big data would improve migrants’ access to healthcare. The paper does make a strong argument as to the potential of big data to identify vulnerable populations through immobility or non-migration. Yet there is still room to expand and better articulate what this evidence would do, particularly when combined with more targeted research into the lived experience of such groups.

It is welcome to see the final two topics addressing debates on ethical, privacy and security concerns, and then the role of big data in addressing political narratives. These are closely interlinked and it is perhaps in this space where more research, discussion and innovation need to emerge before big data can truly impact on migration research. The paper evidences some interesting discussion but again, could do more to outline in what ways the significant challenges identified can be overcome.

The paper does leave you wanting more, asking ‘how’ to realise the opportunities suggested. But this is perhaps beyond the scope of a commentary on a workshop discussion. In fact, the authors’ insights into how to run future cross-disciplinary workshops which better account for the theoretical, language and knowledge-sharing differences between disciplines are perhaps the next steps in the progression of the ideas summarised in this paper. It is here where cross-disciplinary discussion will work out how, and the authors could have made more of this.



Note:
This review refers to round of peer review and may pertain to an earlier version of the document.

 Open peer review from Fran Meissner

Review

Review information

DOI: 10.14293/S2199-1006.1.SOR-COMPSCI.AAICCM.v1.RNRPQL

Review text

As already indicated by the first reviewer, this paper provides a very readable and accessible contribution on the question of using big data in the field of migration research. It brings together insights from a very interdisciplinary and indeed cross-sector (academic and non-academic) set of actors in debates surrounding migration and/or new data sources. I find the presentation of debates useful and illuminating in terms of how such a diverse set of actors approach and engage with these very topical questions. I can recommend the paper but think that it can be made even stronger by more clearly laying out its rationale, by thinking about framing the discussion in terms of open questions and challenges (instead of just challenges) and by engaging at least somewhat with a more nuanced appraisal of the difficulties that are inherent in the very broad definition of big data that the authors chose. Engaging with these points would also offer opportunities to engage with more recent debates on very important challenges that the big data hubris brings with it.

Let me address each point in turn. The authors note in their introduction that the paper is intended “to assist migration experts in deciding whether the use of big data is appropriate for their work”. However, I feel that the paper as it progresses does not leave sufficient room to consider that in a number of scenarios (particular) big data applications may simply not be the appropriate approach. I feel that the paper too readily embraces the tacit assumption that big data must be good for understanding migration-related social, economic and health phenomena. Looking at the references included to make this claim, it is clear that this assumption is very much directly adopted from reports driven by and bound to policy agendas (such as those set by the global compact for migration) that are not critically questioned. However, the paper itself clearly demonstrates that the assumption that bigger is better might not always hold given a gamut of unresolved questions about the quality of the data and how it is being analysed. I feel that the paper would benefit from more clearly highlighting that, oftentimes, when the potential of big data is referred to, that potential has for the most part not been met or demonstrated because of questions that remain open and concrete challenges that may or may not be overcome.

For that reason, I feel it might be useful to make a distinction between what we don’t know in technical terms, e.g. “How do we manage fragmented data sources across varied spatial and temporal scales?”, and concrete challenges that may not be possible to overcome with technical solutions, like “How do power imbalances influence the use of big data?”. The latter makes ethics not just a concern of public-private partnerships in this field but very much generally important, as some methods can be highly invasive and lack regulatory oversight. In this vein, I think that the paper would benefit from explicitly recognising that, just as migration is shrouded in political narratives, so is big data and its assumed beneficial role in making migration legible (with tools that always keep the researcher at least one, often multiple, steps removed from those they are studying and whose actions and practices the research aims to better understand: migrants). Many of the promises of big data migration analysis can be and are critically interrogated (e.g. is it really less costly in the larger scheme of things if a market for big data migration analytics capitalises on framing migration as risk (Taylor and Meissner 2019), and is it acceptable that refugees and other vulnerable populations on the move become the testing population for new data technologies (see Molnar 2020)?). In this sense, and in line with the aim set out in the paper’s introduction to assist migration experts in making a decision, I would welcome some reflection in the discussion part not only on what big data migration research might be able to help with but also on what kinds of questions it simply cannot address. Here I very much welcome that the authors already note that “A key output of the workshop was a consensus that researchers and decision makers must first ask why they require additional data and whether this is what all parties, particularly migrants, would want.”

Relatedly, and to move on to my third concern, I would encourage critical reflection on how the broad definition of big data chosen to frame the article leaves room for glossing over issues or downplaying them. This is particularly evident in the section of the paper on the potential for health care interventions, which essentially points to a big survey (cohort studies) as a way to be better prepared for migrant health needs. As Pelizza and Milan (2020) point out in light of the Covid-19 pandemic, there is a dilemma between obtaining the data needed and allowing migrants to remain invisible to often punitive regulatory regimes. The opportunity outlined in that section is thus one that also comes with various challenges, including some that might best be mitigated with more traditional methods but that would be exacerbated through methods such as social media monitoring, which does not meet European data protection standards. In this light I also feel that the section on understanding environmental drivers of migration is illuminating. It sounds as though the section is making the argument that tech can help us avoid actually engaging with the very complicated decision-making processes involved in migration, whereas there is now much research that alerts us that it cannot. Again, given some of the more critical comments in the article, I am not sure that message is the intended one, so it would be good to signpost more effectively. It seems to me that there is an important difference between using satellite data to identify environmental risks and using that data to monitor and predict migration. If the researchers’ concern is indeed with populations vulnerable to climate events, would identifying the climate risk and then engaging with the at-risk populations not be a more effective way of combining our analytical tool kit? Overall, I would welcome it if the authors tried to see whether the opportunities sections could at least in part be informed by the challenges sections.

Having presented my concerns, I hope it is clear that I think the paper makes an important contribution, but that I also think it would benefit from more clearly signalling to the reader that these are ongoing debates, that those debates require attention to why data is needed and what it is needed for, and that big data also comes with big problems. It is also important to recognise that big data analytics brings many more stakeholders to the table, who may have various different motivations for pushing big data analytics as a supposed panacea for our patchy knowledge about an extremely complex and politically charged social process: migration.

Molnar, Petra (2020): Technological testing grounds. Migration management experiments and reflections from the ground up. Refugee Law Lab and EDRi.

Pelizza, Annalisa; Milan, Stefania (2020): The dilemma of making migrants visible to COVID-19 counting. In Processing Citizenship, 4/28/2020. Available online at https://processingcitizenship.eu/the-dilemma-of-making-migrants-visible-to-covid-19-counting/, checked on 6/16/2021.

Taylor, Linnet; Meissner, Fran (2020): A Crisis of Opportunity. Market-Making, Big Data, and the Consolidation of Migration as Risk. In Antipode 52 (1), pp. 270–290. DOI: 10.1111/anti.12583.



Note:
This review refers to round of peer review and may pertain to an earlier version of the document.