This is a request for comments: The post describes a potential future project that might benefit multiple effective altruistic organizations, particularly in global health and development. I would love to hear your thoughts on all angles of this:

  • Is this worth doing? How would you rate importance, tractability, neglectedness?
  • All other thoughts are welcome too, even (especially?) if they are critical.
  • It would be particularly great to hear from practitioners who work in developing countries.

The full text of the article follows:


For many regions of the world, there are no authoritative and openly accessible place names. Solving this problem would help all development and humanitarian work. And solving this problem might be possible with the right mix of people and skills. I'm considering dedicating my work time to this. The present post explains the idea, with the aim of getting as much feedback as possible.

A fictional map, showing a sparsely populated region and a place with unknown name.

Problem: wait, no place names?

Approximately two billion people live in places that are not labeled on any map, like this village in the Democratic Republic of Congo. Google Maps shows "Walikale" as the address, but that is either a territory of almost 25,000 square kilometers or the name of a town about 100 km away. Other maps and gazetteers, such as OpenStreetMap and Who's on First, are no better.

The lack of place names has many negative effects, like slower development, difficult humanitarian work, or worse governance. Consider the example of bednet distributions against malaria. This is familiar to me, because I have worked at the Against Malaria Foundation for the last 2.5 years. Bednets are an essential tool to fight malaria, and so countries like the Democratic Republic of Congo distribute them to all households every three years. Without reliable lists of location names:

  • Planning the distribution is hard: how to communicate where nets should go, and how many are needed in each place?
  • People miss out on nets: entire villages might be left out if the distribution agents aren't aware of them.
  • Monitoring is hard: Typically, AMF selects a subset of locations randomly and sends independent monitoring teams there to verify the distribution. This only works if there are place names.
  • Inconsistencies make everything difficult: For example, we want independent monitoring teams, but we can't use their data if location names don't match those used by the distribution teams.
  • New technologies cannot be used: Distributions would benefit from better digital data collection tools, dashboards, satellite-based population estimates, etc. The adoption of these tools is often difficult because they require place names and boundaries.
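The inconsistency problem in the list above can be made concrete with a small sketch. It assumes two hypothetical lists of place names, one from a distribution team and one from a monitoring team, and reconciles them with accent-insensitive normalization plus fuzzy matching from Python's standard library. The function names, the sample names, and the 0.85 similarity cutoff are illustrative assumptions, not part of any existing AMF tooling.

```python
import difflib
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(cleaned.split())


def match_names(distribution, monitoring, cutoff=0.85):
    """Map each monitoring name to the closest distribution name, or None.

    The cutoff is an assumed threshold; a real project would tune it and
    send unmatched or ambiguous names to a human reviewer.
    """
    lookup = {normalize(n): n for n in distribution}
    result = {}
    for name in monitoring:
        hits = difflib.get_close_matches(
            normalize(name), list(lookup), n=1, cutoff=cutoff
        )
        result[name] = lookup[hits[0]] if hits else None
    return result


# Illustrative sample data only.
distribution_names = ["Kalamu", "Mutongo", "Pinga"]
monitoring_names = ["kalamu ", "Mutongo-", "Pingha", "Kibua"]
matches = match_names(distribution_names, monitoring_names)
```

Fuzzy matching of this kind only papers over the problem: two different villages can share a name, and one village can have several names. A shared list of places with stable identifiers would avoid the guesswork.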

An important consequence is that distributions cost more. Funders regularly buy 5% or 10% extra nets to ensure that there are no stockouts despite low-quality planning data. In addition, there is an engineering cost of dealing with missing and inconsistent data. When I was at AMF, a significant part of my work time was spent on these problems.

Bednet distributions are an area that I know well, but the example generalizes. Most non-profits, NGOs, and governments working in least developed countries would benefit from better place names. Better data would improve routine healthcare, make it easier to organize fair elections, and much more. Yet collecting and maintaining such data is nobody's core responsibility, and so nobody has solved the problem yet.

The goal

I want to build the world's best dataset of place names, boundaries, and metadata (such as population estimates), with a focus on least developed countries.

The current state of the art looks roughly like the graphic below, which comes from the excellent overview State of the Gazetteer in 2023 by Who's on First. Regions with orange dots have data on local administrative areas (e.g., village or town names/boundaries). They are missing in most of Africa and large parts of Asia. I want to make sure this map becomes fully covered.

Coverage of local administrative names in the WOF gazetteer (source)

The most useful data describes places of roughly 100 households. This is a typical rural village or a small urban neighborhood. At this level of detail, the data is fine-grained enough for detailed planning. And while practitioners can always use higher levels of the location hierarchy or merge places, going the other way and splitting coarse units into smaller ones is much harder.

To make the data fully useful, we also want boundary polygons. This allows us to link places to other data sources, e.g., GPS coordinates or satellite imagery.
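As a rough illustration of how boundaries link places to GPS coordinates, the sketch below assigns a point to the place whose polygon contains it, using the standard ray-casting test. It treats longitude and latitude as planar coordinates, which is only an approximation for small areas, and the `places` dictionary with its square "villages" is made up for the example. A real pipeline would more likely rely on an established GIS library.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: True if (lon, lat) lies inside the polygon ring.

    `ring` is a list of (lon, lat) vertices; the last vertex is implicitly
    connected back to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray starting at the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def locate(lon, lat, places):
    """Return the name of the first place whose boundary contains the point."""
    for name, ring in places.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None


# Hypothetical boundaries: two adjacent square "villages".
places = {
    "Village A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "Village B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
household_place = locate(1.0, 1.0, places)
```

With boundaries in place, a household GPS reading collected by a field team can be attached to a named place automatically instead of relying on a typed-in village name.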

Finally, we want estimates of the coverage and quality of the data. We want to know what percentage of settlements and people are covered, and get a list of the unknown settlements. For metadata like population numbers, it would be great to have confidence intervals or some other measurement of accuracy. All data should also come with information on provenance and recency.
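One possible way to compute such coverage figures is sketched below. It assumes a hypothetical list of settlements (for example, detected from satellite imagery) with population estimates, and a set of settlement IDs that have already been linked to a named place. The field names and numbers are invented for illustration.

```python
def coverage(settlements, named_ids):
    """Summarize how much of a region is covered by named places.

    `settlements` is a list of dicts with "id" and "population" keys;
    `named_ids` is the set of settlement ids that have a known place name.
    """
    if not settlements:
        return {"settlement_share": 0.0, "population_share": 0.0, "unknown": []}
    total_pop = sum(s["population"] for s in settlements)
    covered_pop = sum(s["population"] for s in settlements if s["id"] in named_ids)
    covered_count = sum(1 for s in settlements if s["id"] in named_ids)
    return {
        "settlement_share": covered_count / len(settlements),
        "population_share": covered_pop / total_pop if total_pop else 0.0,
        # The list of settlements that still need a name.
        "unknown": [s["id"] for s in settlements if s["id"] not in named_ids],
    }


# Invented example data.
settlements = [
    {"id": "s1", "population": 100},
    {"id": "s2", "population": 300},
    {"id": "s3", "population": 50},
    {"id": "s4", "population": 50},
]
report = coverage(settlements, named_ids={"s1", "s2"})
```

Reporting both shares matters because they can diverge: in this made-up example, half of the settlements are named, but they hold most of the population.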

Candidate solutions

Solutions to this problem of missing place names involve identifying good data sources, cleaning and aggregating the data, publishing the result in all relevant places to ensure its use, and finally maintaining the data so it stays up-to-date.

A few candidate data sources:

  • Governments, ministries of health.
  • NGOs like the Against Malaria Foundation, but also local actors such as SANRU in DRC. These could provide not only place names, but also GPS coordinates of households or population estimates.
  • The UN Office for the Coordination of Humanitarian Affairs (see for example their data sets at humdata.org).
  • Academic initiatives such as the Columbia Population Research Center or GRID3.
  • Initiatives by for-profit organizations like Facebook.
  • New satellite-based data sets like World Settlement Footprint and Sentinel-2 Land Use.

The people at the Who's on First gazetteer have recently completed gathering and importing place data for India. I highly recommend their write-up of the Karmashapes project. Their approach and techniques might translate to other parts of the world.

Potential challenges

Building the world's best place data set is not without challenges. Below is an initial bullet list. This is an area where feedback and ideas would be particularly welcome.

  • There are many existing initiatives addressing parts of the problem. We need to better understand why they have not yet produced the data that we'd like to see.
  • Authoritative data might not exist. People have disputes around boundaries. Some places (e.g., South Sudan) might not know or want the concept of "village name".
  • There is a risk of colonialist thinking, where we (the rich first-worlders) believe we know what less developed countries need. Also, the data is potentially dual-use: it can serve dictators as well as humanitarian organizations.
  • The data set needs to be maintained to stay relevant.
  • The data set should be published under an open license to maximize usage.
  • It's unclear how to make a living when working on this project.

Conclusion

With this early project write-up, I am primarily looking for feedback. Any thoughts are welcome: Is this a good idea? Why or why not? Is this an important problem to solve? How could it be done? Whom should I talk to? What is missing?

Don't hesitate to reach out to me, Jonas Wagner ltlygwayh@gmail.com.

Comments

OpenStreetMap's humanitarian team (https://www.hotosm.org/) seems to me to be working on similar topics.

Thanks! This seems very relevant. I will try to contact the team.

Are you familiar with What Three Words?

Yes, I know about What Three Words. Thanks for the suggestion! It's a good opportunity to clarify the different aims of my project and W3W.

W3W is essentially the same as a GPS coordinate, except more memorable and easier to pronounce. A W3W place does not necessarily correspond to anything particular in the real world (like a settlement). Thus, W3W does not provide any added value for planning purposes.

There are some other downsides, such as W3W being proprietary and based on (IMO) bad design choices (e.g., hard to localize).

A better alternative to W3W is https://maps.google.com/pluscodes/. Plus Codes are indeed useful in places where some form of address is needed, and they are seeing some adoption in developing countries.

My goal is somewhat different: I would like to collect and publish the natural, given names of places, along with boundaries and metadata. The ideal unit here is the settlement, village, community, or neighborhood -- this is the level at which the data would most support humanitarian work, health services, elections, infrastructure development, etc.

"There are some other downsides, such as W3W being proprietary and based on (IMO) bad design choices (e.g., hard to localize)."

For more on this, and why I think we shouldn't advocate for W3W, see: https://shkspr.mobi/blog/2019/03/why-bother-with-what-three-words/ for theoretical reasons and https://w3w.me.ss/ for some practical examples.

As you mention, https://plus.codes is indeed much better, although this is only tangentially related to your project.

Can you expand on why the ideal unit is "the settlement, village, community, or neighborhood"?

Here are some reasons why I think that units of ~100 households are ideal. The post itself has more examples.

  • It's best for detailed planning. There is a type of humanitarian/development work that tries to reach every household in a region. Think vitamin A supplementation, vaccination programs, bednet distributions, cash transfers, ... For these, one typically needs logistics per settlement, such as a contact person/agent/community health worker, some means of transportation, a specific amount of bednets/simcards/..., etc.

    Of course, the higher levels of the location hierarchy (health areas, counties, districts, ...) are also needed. But these are often not sufficient for planning. Also note that some programs use other units of planning altogether (e.g., schools or health centers), but the settlement is common.

  • It's great for monitoring. The interventions mentioned above typically want to reach 100% settlement coverage. It makes sense to monitor things at that level, i.e., ensure that each settlement is reached.

  • It's great for research. Many organizations use household sampling surveys. These are typically clustered, which means that researchers select a given number of "enumeration units", and then sample a fixed number of households in each unit. Ideally, these enumeration units have roughly even size, clear and well-understood boundaries, and known population counts. The type of locations that I'm aiming for would make good enumeration units.

  • This type of place name is used and known. For example, people in the region will know where "Kalamu" is. There will likely be a natural contact person, such as a village chief. There will be a road that leads there and a way to obtain transportation. One can ask questions like "is there cellphone coverage in Kalamu" and get a good answer. In the majority of cases, a place name is a well-understood, unambiguous and meaningful concept.

The final reason is about data availability: settlement names are usually the most detailed names available, and their names are reasonably stable and accepted. The data exists, we only need to collect and aggregate and publish it. In contrast, streets or buildings often don't have names, so we can't easily have more fine-grained data than place names. Plus, there are some solutions like Plus Codes for situations where address-like data are preferred.
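To illustrate the clustered household surveys mentioned above, here is a minimal sketch of two-stage sampling: first pick enumeration units at random, then pick a fixed number of households in each. The `units` mapping (place name to household IDs) and the sample sizes are hypothetical, and the first stage uses simple random selection, whereas real surveys often weight units by population.

```python
import random


def two_stage_sample(units, n_units, n_households, seed=0):
    """Select enumeration units, then a fixed number of households in each.

    `units` maps a place name to a list of household ids. A fixed seed makes
    the selection reproducible, so it can be audited afterwards.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(units), n_units)
    return {
        unit: rng.sample(units[unit], min(n_households, len(units[unit])))
        for unit in chosen
    }


# Hypothetical places with roughly 100 households each.
units = {
    "Place A": list(range(100)),
    "Place B": list(range(120)),
    "Place C": list(range(90)),
}
sample = two_stage_sample(units, n_units=2, n_households=10)
```

This only works if the list of places is complete and the household lists are tied to clear boundaries, which is exactly what the proposed dataset would provide.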

I can also confirm that an early employee of W3W told me that supporting development work was one of the main original aims of W3W.

After a quick read, this was my first thought too (i.e., that promoting and advocating for the use of "what three words" might be an easier solution).

As an aid worker, I think this is a very interesting idea. Many folks have already mentioned a few tools that can be useful, such as OpenStreetMap. I would also love to see a tool that could assess the current needs of these communities.
