To mitigate planetary risks, human data needs to be backed up in as many places as possible, both on Earth and off Earth. Space provides an ideal environment for data storage, yet also brings with it new challenges.
A large network of Arch™ Libraries (pronounced “Ark”) disseminated across the solar system will guarantee preservation of human data, no matter how much information humans create, for as long as the solar system exists.
The key to making this work is increasing the amount of data we can store per Arch™ Library, and disseminating Arch Libraries to a large number of locations across our Solar System – this is the vision of The Arch Mission.
What is an Arch?
An Arch Library (pronounced "Ark") is a special storage device designed to store and transmit large archives of digital and analog data over very long periods of time (thousands to millions of years), in harsh environments such as outer space and on the surfaces of other planets.
Who are the Archs for?
The Arch Libraries are primarily designed for humans, both in the near future and the distant future. There is also a slight, but non-zero, probability that the Arch Libraries might be useful to non-human intelligent lifeforms in the distant future.
Primary Audience: Humans
Present day: The Arch Mission Foundation™ benefits humans today by inspiring more participation in the movement to settle space. The Arch Mission Foundation provides a way for everyone to get involved by including knowledge to pass down on the Arch Libraries. This is a vehicle for connecting the humanities fields to the space movement, and for making the space movement tangible to everyone today -- not just millionaires and billionaires. The Arch Mission Foundation will also drive innovation and new discoveries about how to make human knowledge portable and connected across the solar system.
Near-future: The Arch Mission Foundation will have tangible benefits in decades to come: Arch Libraries can be used to back up and deliver enormous archives of knowledge (such as all the content we want, even the entire Internet) to human settlements and colonies in space and on other planetary bodies around our solar system over the next several hundred to few thousand years.
Long-term future: In the unlikely event that something terrible wipes out our civilization and/or species on Earth or across several planets, the Arch Libraries will be treasure troves of knowledge for any humans who survive, in eventually rebuilding civilization. They may also potentially be useful to future humanoids that survive and/or evolve again in our solar system, long after our species is gone, millions or billions of years in the future.
Secondary Audience: Non-Humans
There is a low but non-zero probability that the Arch Libraries may be useful to non-human intelligent lifeforms who visit our solar system in the future -- including biological or non-biological (digital) lifeforms that may exist.
We cannot anticipate all the possible form factors, scales, sense organs, types of brains, types of minds, and levels of intelligence or interests, that non-human lifeforms may have.
Designing Arch Libraries that can be understood by anyone, and/or teach anyone how to understand them is a very difficult task.
However, in the event that a species that is not similar to us someday finds the Arch Libraries, we will attempt to design Arch Libraries to be as accessible as we can.
First of all, we have to make some assumptions: we have to assume that we are designing for species that exist within a range of scales similar to our own, that have sense organs similar to our own, and that are capable of, and interested in, figuring out how to understand our data.
This brings up many deep unsolved technical and philosophical questions about the portability of knowledge and meaning across species.
But despite our best efforts, it is still possible that at least some types of non-human life that may visit our solar system in a million years will just find the Arch Libraries to be delicious snacks, or they will find them to be quite uninteresting, or totally impossible to relate to or even to sense at all. Therefore we don’t consider non-human lifeforms to be our primary audience.
Whether we can solve these questions, or even make meaningful progress towards solving them, is still unknown -- but at least we will learn something by trying.
Humanity is entering a period of intensified opportunity and risk. As our technological progress accelerates, so does the risk that we will accidentally or intentionally destroy, or seriously damage, our civilization and environment.
Our civilization is the most advanced to ever arise on Earth, but it is also the most vulnerable. We rely on the most fragile and ephemeral means of data storage in history.
Our paper books, plastic DVD disks, magnetic tape cartridges, and flash drives are highly prone to damage and decay. These storage media last only decades to perhaps a few hundred years, in the best of circumstances, but can easily be destroyed accidentally or intentionally by a wide variety of threats both natural and man-made.
We are on the verge of becoming a spacefaring species, yet we still don’t have a plan in place to protect our own data on the planet we already live on.
We think our civilization will never end, even though history is littered with the undecipherable remains of once great but now lost civilizations.
Perhaps we are right and we will be different -- but for our own long-term insurance as a species, we have a responsibility to back up our civilization’s most valuable knowledge and data, just in case we need it later, and frankly, because we can.
As we spread beyond Earth we must develop the means to take our data and knowledge with us, in a form that can survive in space, and on new worlds. As we grow into an interplanetary species and civilization, we must find ways to connect and share our knowledge and data across the solar system.
Why Send Archs to Space?
The primary reason to send Arch Libraries to space and to other planets is to protect against the risks of archiving our civilization on any one planet.
It is guaranteed that on long timescales every planetary body in our solar system will undergo extinction level events such as comet and meteor impacts, tectonic shifts, high intensity solar radiation blasts, etc.
A secondary reason to send Arch Libraries to space is to carry our growing library of human knowledge and data, to every location that humans inhabit or colonize, across the solar system so that it can be accessed locally and quickly, even if there is a temporary or permanent disruption in communication with Earth from any such location.
Thirdly, sending Arch Libraries to space facilitates the development of technologies that lead to the formation of a solar system wide network of data caches, and ultimately a solar system wide Internet.
Has this been done before?
There have been many projects to preserve and protect knowledge over long periods of time, before the Arch Mission Foundation.
Examples of previous and existing long-term archiving projects that have inspired us include prehistoric cave drawings and aboriginal artifacts; Native American carvings in stone; the Pyramids of Egypt, Mexico, and Peru; Easter Island; Angkor Wat; the Library of Alexandria; monastic libraries; the Vatican Archives; the Smithsonian Museum; National Archives and Museums of Natural History; public library systems; Iron Mountain; global seed banks; the Memory of Mankind Project; numerous public and private time capsules around the planet; the Voyager Disk project; the Internet Archive; the Long Now Foundation; and many others.
But the Arch Mission Foundation is unique among these in that it is an organization designed to continuously use new materials and technologies to archive humanity’s knowledge in space and on the surfaces of other worlds, for at least thousands to billions of years.
How many Archs will there be?
There will be a growing number of Arch Libraries. At first there will be dozens, but by 2020 there will be at least thousands in various locations. As we develop the ability to encode the Arch Libraries onto DNA molecules that can be rapidly and cheaply replicated, it will be possible to distribute millions to billions of Arch Libraries. A day may come -- in less than 100 years -- when everyone will carry a copy of the Arch with them in a piece of jewelry, or perhaps in their own body.
What content will be on an Arch?
The Arch Libraries will include a vast and growing collection of human knowledge. This will include open data sets from the Wikimedia Foundation and the Internet Archive, as well as many other data sets contributed by individuals and organizations (genome maps, reference material collections, libraries of data, personal and organizational backup data sets, etc.).
At first, due to storage limitations, Arch data sets will have to be curated to fit various storage media and mission profiles. However, within 10 to 15 years the need for most curation will likely vanish as new storage capabilities (such as new nano, atomic, and molecular storage we have working in various labs) will enable unlimited storage capacity in space at increasingly low costs.
One of the more interesting questions that we ponder is what will be most useful and interesting to potential recipients of the Arch Libraries at different timescales?
One interesting observation is that on very long timescales, and for very advanced recipients in the distant future, it may turn out that the data that is most interesting and/or valuable to them is not our science and technology (they may have the same or greater level of science and technology as we do), but rather our unique history, languages, folklore, traditions, literature, philosophy, religion, art, music, film, and beliefs -- which are impossible to replicate.
While science may be the same everywhere in the universe (because the physical laws and mathematics are invariant), culture and art are truly unique, and may be of more long-term value to preserve and transmit. For example, there may be many equivalents to Einstein across the universe, but there is only one Van Gogh, one Bach, one John Lennon, one Gandhi, one Mother Teresa, one Martin Luther King Jr., one Coco Chanel, etc.
Another very important set of data that is valuable to preserve is biological data - particularly DNA in both digital and molecular form, as well as seeds, and important molecules and compounds. We can store vast amounts of DNA sequences from all species and ecosystems in a relatively small and lightweight container.
Our approach in designing the Arch Libraries is to include different layers of information, for different audiences, with different technological capabilities, where each layer teaches what is necessary to access the knowledge on the next, higher-resolution layer of data. The very first layer of data has to be visible to the naked eye. The next layer, to an optical microscope, then next to a recipient with a laser and a computer, etc.
Who curates the content on the Arch?
Initially the Arch Mission Foundation team is curating the first data sets.
Our philosophy of curation favors the idea of “curating the curators” whenever possible. What we mean by this is that when possible, we curate collections of data that are already curated by other groups of credible curators (for example, Wikipedia or other encyclopedias, Project Gutenberg, etc.).
In certain areas, where no source of curated data exists, and/or where we cannot get that data to include in the Arch™ Library for various reasons, we may have to acquire and curate the data ourselves. But for the most part we hope to reuse the good work of other curators.
Our hope and objective as meta-curators is to try to curate other curated data sets that represent a broad and inclusive range of perspectives, experiences, ethnicities, nations, traditions and cultures, so as to accurately reflect the entire scope of human diversity.
One way to do this is to also enable everyone who participates in the Arch to contribute their own wisdom and knowledge, or other special archives for backup in a portion of the Arch Libraries.
As the Arch Mission Foundation grows and matures we will also form a number of boards and committees designed to protect the organization and its data against bias and to ensure that the full scope of human experience is represented in the Arch Libraries.
However, no matter what we do, it is never going to be possible to satisfy every different group completely in any curated data collection.
The only solution that will satisfy everyone completely is to have so much storage space that curation really is not necessary, because there is room on the Arch Libraries for virtually everything that anyone wants to include.
Virtually unlimited storage is actually a real possibility within decades, thanks to new technologies that are already in the lab today.
Is there data or knowledge that will not be included?
This is a question we have struggled with. It’s not easy to come to a decision about it. There is a large amount of knowledge and data that might be considered by some to be offensive, toxic and/or too dangerous to hand down to the future. But where do we draw the line, and who gets to decide what that line is? As soon as we decide to censor ourselves it’s a slippery slope and leads to very dangerous territory -- perhaps worse than the problem we were trying to avoid.
Therefore, we believe it is best to include everything -- all perspectives and beliefs -- but in a way that reflects the relative proportions of humans who support those ideas today across the population.
For the time being, while we have storage limitations (for a few more decades), we also have to make these decisions based at least to some extent on how much room we have. This means that for now, we will seek to include broad curated data sets, such as open data sets like the Wikipedia and other broad archives, that represent the full range of human beliefs and knowledge and are curated by large communities that take great pains to adjudicate these questions.
Ultimately, in addition to large curated archives of open data, and data sets that are offered to us by other organizations that own them, we also hope to provide every human and every organization and nation with some space on the Arch Library, with which to do with what they wish. And for those who want more space, and can provide the funding to get it, more space will be available. These goals will begin to become possible soon, and within a few decades will be entirely achievable.
How Much Data Fits onto an Arch Device?
Individual Arch storage devices will always be finite. But sending endless numbers of such objects, cheaply and continuously, into space will be possible very soon, and this means we effectively have infinite aggregate data capacity on the horizon.
Even on a per Arch device level, within 10 years, we will be able to write hundreds of Terabytes, and potentially even several Petabytes, of data, to individual optical storage objects, using current methods we are already using, and that are developing very quickly.
Similarly we already have proof that we will be able to store and massively replicate vast data sets on non-living DNA molecules cheaply and easily. And we are starting to explore the limits of classical physics for long-term archival data storage, and beyond that even quantum storage.
We are also exploring ways to send remotely updateable storage devices, and remotely readable objects. There is a tradeoff between long-term durability and the ability to write many times to an object -- but since our goal is durability, we don't mind sacrificing the ability to rewrite the same data object: there is a lot of room in space to simply send more data archives into.
And delivering these objects may not even require expensive spacecraft eventually. We may be able to simply shoot them into locations around the solar system from orbit, using various devices that can accelerate them cheaply from Low Earth Orbit or other locations around the solar system (such as railguns, centrifuges, laser propulsion, or electromagnetic propulsion, to name a few options).
A network of robotic data object routers may one day be able to send, catch, route, and then pass on these objects on various paths between the planets -- like an endless data highway where the "packets" are actually tiny big data Archives.
Of course this is only necessary if the goal is durability over long periods of time. If all that is needed is short term durability then beaming data with lasers, or through other transmission methods, will work quite well. But our goal is long-term data durability.
Arch storage devices will initially be write-once -- so to send an update, we will need to send another Arch data object to the same location or set of locations.
This could get interesting. Imagine, for example, that we could make rings of small Arch data objects -- each one the size of a small crystal or even just a grain of sand -- at various distances from a planet, like the rings around Saturn. Imagine that we could cheaply produce these and send billions or trillions of them, with high redundancy, to various locations. These would literally be data clouds composed of individual Arch objects.
To send more data to such a collection, we just send more Arch objects to a specific part of one of the rings. To read out the data, a robotic drone that lives in the ring can navigate over to the sector that contains what is requested, read it out (by literally retrieving some copies and reading them), and then transmit its contents, or accelerate the object itself, to another location.
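As a rough illustration of the write-once update model described above, here is a minimal Python sketch. The names `ArchObject`, `Ring`, `send`, and `read`, the idea of a version number, and the default of three redundant copies are all assumptions made for illustration; they do not describe an actual Arch design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ArchObject:
    """An immutable, write-once data object (hypothetical model)."""
    key: str
    version: int
    payload: bytes


@dataclass
class Ring:
    """A ring divided into sectors, each holding redundant write-once objects."""
    sectors: dict[str, list[ArchObject]] = field(default_factory=dict)

    def send(self, sector: str, obj: ArchObject, copies: int = 3) -> None:
        """Add redundant copies of a new object; existing objects are never modified."""
        self.sectors.setdefault(sector, []).extend([obj] * copies)

    def read(self, sector: str, key: str) -> bytes:
        """Return the payload of the newest version of a key found in a sector."""
        matches = [o for o in self.sectors.get(sector, []) if o.key == key]
        if not matches:
            raise KeyError(key)
        return max(matches, key=lambda o: o.version).payload
```

In this sketch an update never overwrites anything: a newer object is simply sent to the same sector, and a reader picks the highest version it can find among the copies it retrieves.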
And eventually it may be possible to remotely read and write to clouds of data objects in space. One model for this is a hybrid: autonomous robotic storage devices would carry transmitters and receivers and use less durable storage as a short-term cache with remote read-write capabilities. For long-term archival storage, these robots could manufacture static write-once objects and deploy them into potential locations in space, where such objects can later be easily collected, read back into short-term memory, and even added to and re-deployed to other locations. The possibilities are intriguing.
What if nobody finds or understands the Arch libraries -- and why build them if that might happen?
Can we guarantee that there will be recipients of the Archs? Can we guarantee that beings in the distant future who find the Archs will be able to use them? No. We cannot make those guarantees.
But we can guarantee that if we do not make the Arch libraries there will definitely be no chance of anyone ever finding them or ever being able to make sense of them.
At least by making the Arch libraries, there is a chance -- a non-zero probability (perhaps a reasonably large one) -- that someone will find the Archs and be able to learn from them in the distant future.
In other words, if you want guarantees, then think about it like this. If we don't make the Arch libraries, then it guarantees they will not be found and cannot ever benefit anyone. If we do make the Arch libraries it guarantees that there is a non-zero probability that they will be found and will be useful to recipients in the distant future.
It's worth doing if it has reasonable odds of helping beings in the future. But even more important is the fact that building the Arch libraries benefits beings today, and in the near future, and perhaps for hundreds of years in the meantime.
The act of making the Arch libraries changes who we are as a species and orients us towards becoming a spacefaring civilization that lives on multi-thousand year timescales over vast interplanetary distances. It's a new way of thinking.
The act of making the Arch libraries brings us together and gives us a common purpose and mission -- a common corpus that we can be proud of passing down -- a common legacy and gift to the future of our data and knowledge, including our biological legacy as well.
Making the Archs also will catalyze advances and richer conversations and collaborations in the near-term, about carrying big data into space -- across science, technology, the startup sector, the humanities fields, the public sector, and the space industry.
Important commercial and technological benefits may come from this process that will benefit society and perhaps even the human race. And that is worthwhile!
So there is great potential for benefit, and really not much harm in trying. Even if nobody lives in our solar system in 10 million or a billion years, the act of building monuments is at least as beneficial to the builders as to potential recipients in the future.
How Will Anyone Understand What is in the Archs?
We plan to send thousands of larger Archs on crystal and metallic disks, and billions or trillions of Archs written onto molecules of DNA. This will take decades if not longer; it’s an ongoing process that we envision will never end. Within this nearly infinite amount of storage we will include all human knowledge that anyone wants to pass down, eventually.
This will include the complete knowledge we have of all human libraries and everything there is to know about human languages, cultures, histories, belief systems, the arts and sciences, technologies, nations, and the peoples and other species of the Earth.
This will also include huge amounts of knowledge specifically collected to help future recipients learn to understand the knowledge we include in this endless cloud of data — including thousands of encyclopedias, dictionaries, linguistic references, reference works, visual teaching materials, tutorials, textbooks, and millions of other documents, including all the open source software anyone wants to pass down as well. And in addition we will include the plans, blueprints, documentation manuals and more. (But we still don’t expect anyone to read the documentation manuals, at least, even in a million years.)
Why not send a big hard drive?
Effectively we are sending big hard drives -- but written onto new materials (like quartz, or DNA), using new techniques (like exotic lasers, ion beams, or special genomic technology), so that the data is extremely high capacity, lightweight and small, and durable for thousands to billions of years.
Existing hard drives don’t last long on Earth, let alone in space or on other planets. While putting spinning disks or flash drives in space does work for a few years, it is not a viable strategy for long-term cold storage or archiving of data. Furthermore today’s hard drives don’t have the storage capacity needed for extremely large long-term archives.
Space is a very harsh environment - with extreme temperature changes, high energy cosmic rays, electromagnetic fields, micrometeorites, no oxygen, vibrations from spaceflight, etc. The surfaces of other planets also bring with them a range of unique challenges.
To store and archive data in space requires new materials and approaches.
How will Archs be decoded?
The Arch Libraries are designed with different layers and types of data encoding for different types of recipients with different levels of technical ability.
Our goal is for each Arch Library to include layers that disclose useful content for each level of audience that might encounter them.
The first layer must be visible to the naked eye and will at least indicate “this object is special and contains something valuable” using symbols and images.
The next layer of information is visible to an audience with an optical microscope. We achieve this using different materials and techniques but the effect is the same: We can essentially encode large amounts of data as tiny images, like microfilm, but even smaller.
This enables us to include visual encyclopedias, language codices, dictionaries, photos, diagrams, pages of documents in a form that can be seen with a microscope, without any knowledge of how to decode digital data.
Data which is within the range of the spectrum that our eyes can see, and which can be detected optically with the eye or a microscope can include both analog (pictures, pages of text) and digital (binary or other encodings) data.
On a per disk basis, optically we can store hundreds of thousands to hundreds of millions of pages of text and images at the nano-optical to subatomic scales using special technologies -- but for 500X magnification microscope level visibility the limit is around tens of thousands of analog pages or images.
Beyond the visual layer, we also are encoding higher resolution data for audiences with access to lasers and computer based digital technology. Here we can store truly big data sets, but it is more difficult for a recipient to access this data. To facilitate this, on the visual layers above this layer, we will include instructions for how to access and decode digital data, and if necessary how to build a computer and laser to do so.
Beyond the laser based digital layer there may also be layers that require the ability to detect and decode molecular, atomic scale or subatomic scale (quantum or holographic) information. We can already encode data in this way, but reading it requires very advanced technology.
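To make the layering idea more concrete, here is a minimal Python sketch of a layer stack in which each layer requires a certain reading capability and explains how to reach the next one. The `Capability` ordering, the `Layer` fields, and the wording of each `teaches` entry are assumptions that paraphrase the description above; this is not an actual Arch format.

```python
from dataclasses import dataclass
from enum import IntEnum


class Capability(IntEnum):
    """Reading technology a recipient has, ordered from least to most advanced."""
    NAKED_EYE = 0
    OPTICAL_MICROSCOPE = 1
    LASER_AND_COMPUTER = 2
    MOLECULAR_OR_ATOMIC = 3


@dataclass(frozen=True)
class Layer:
    requires: Capability
    content: str
    teaches: str  # what this layer explains about reaching the next one


# Hypothetical layer stack paraphrasing the text above.
LAYERS = [
    Layer(Capability.NAKED_EYE,
          "symbols and images marking the object as valuable",
          "look closer with magnification"),
    Layer(Capability.OPTICAL_MICROSCOPE,
          "analog pages: encyclopedias, dictionaries, diagrams",
          "how to decode digital data and build a laser reader"),
    Layer(Capability.LASER_AND_COMPUTER,
          "large digital data sets",
          "how to detect molecular and atomic scale encodings"),
    Layer(Capability.MOLECULAR_OR_ATOMIC,
          "molecular, atomic, or subatomic scale data",
          ""),
]


def readable_layers(recipient: Capability) -> list[Layer]:
    """Return every layer the recipient can read with its current technology."""
    return [layer for layer in LAYERS if layer.requires <= recipient]


def next_step(recipient: Capability) -> str:
    """Return what the highest readable layer teaches about the next layer."""
    return readable_layers(recipient)[-1].teaches
```

For example, `readable_layers(Capability.OPTICAL_MICROSCOPE)` returns the first two layers, and `next_step` on the same capability returns the instructions that lead toward the digital layer.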
The question of how to make the data “decodable” is, however, much more difficult than just how to make the data detectable and retrievable.
Making the data decodable means making it possible for a recipient to actually understand what the data means. This means they have to understand the file format, and any markup or metadata, and the content as well - and this means they have to map it to their own mental model of the world. If they are humans and speak something related to an existing human language - this is not hard to do. But if they are not humans, or have nothing in common with us, then it is not an easy task!
One solution is to include one or more primers that teach how to understand the data formats, languages, and other encodings of the data. To this end, we are including many different kinds of primers created by others as well as primers that may be created by the Arch Mission Foundation.
But we also believe that, at least in the case of technologically advanced recipients, data mining and machine learning may be one of the main techniques that will be used to decode the content of the Arch Libraries.
With that in mind we are working on designing the Arch Libraries to be helpful to future dataminers and AI algorithms. For example, we are including many different versions and translations of the same texts, side by side, with a lot of redundancy and other related information nearby to facilitate finding and understanding the patterns and relationships among concepts.
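To suggest why side-by-side translations help a future data miner, here is a minimal Python sketch that guesses word correspondences from a tiny aligned corpus using a Dice co-occurrence score. The corpus, the function names, and the choice of the Dice coefficient are illustrative assumptions; we are not claiming this is how any recipient would actually decode an Arch Library.

```python
from collections import Counter

# Hypothetical aligned corpus: each entry holds the same passage in two languages.
PARALLEL = [
    {"en": "the sun rises", "es": "el sol sale"},
    {"en": "the sun sets", "es": "el sol se pone"},
    {"en": "the moon rises", "es": "la luna sale"},
    {"en": "the wind blows", "es": "el viento sopla"},
]


def cooccurrence(corpus, src, dst):
    """Count how often each source word appears alongside each target word."""
    counts = Counter()
    for passage in corpus:
        for s in set(passage[src].split()):
            for d in set(passage[dst].split()):
                counts[(s, d)] += 1
    return counts


def best_match(corpus, src, dst, word):
    """Pick the target word with the highest Dice score for the given source word."""
    pairs = cooccurrence(corpus, src, dst)
    # Number of passages in which each word occurs, per language.
    src_freq = Counter(w for p in corpus for w in set(p[src].split()))
    dst_freq = Counter(w for p in corpus for w in set(p[dst].split()))
    scores = {
        d: 2 * n / (src_freq[s] + dst_freq[d])
        for (s, d), n in pairs.items()
        if s == word
    }
    return max(scores, key=scores.get)
```

With only four passages many words remain ambiguous (for instance, "moon" co-occurs equally with "la" and "luna"), which hints at why a lot of redundancy and many parallel versions make the patterns easier to find.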
Beyond simply making the Arch Libraries passively decodable however, we are also thinking more long-term -- how can we make the Arch Libraries proactively decodable -- how can we design the Arch Libraries to proactively help in the process of teaching what they know to a future recipient? For that we may need to use artificial intelligence, and perhaps even virtual or augmented reality technologies.
What if the best way to communicate with future recipients who don’t understand the Arch Libraries is via an AI that is carried by the Arch Libraries and is capable of interacting with them, and/or their AI agents, to figure out how to best communicate and map ontologies sufficiently to then begin teaching, asking questions, and learning? We don’t know the answer to this, but it’s an interesting question we’re thinking about.
Which Archs are in the works?
Presently we are working on the Solar Library, the Lunar Library and the Earth Library. Collectively these will comprise at least thousands of different copies and locations of Arch Library datasets. They will grow eventually to perhaps millions of copies and locations. The Mars Library and other locations around the solar system are also in the works.
How Can I Get A Copy of the Arch?
We are working on several ways that individuals and organizations can get copies of Archs that we create. We are not ready to discuss these plans. We are a not-for-profit organization and we are thinking this through carefully.
How can I contribute my data to the Arch?
We are working on a plan to enable everyone to contribute to the Archs. Meanwhile you can indirectly contribute by helping the Wikimedia Foundation and contributing to large open data sets, which we will be sharing via the Arch Libraries as well. We will be enabling you to add your own knowledge and wisdom and history to the Archs as well.
How can YOU help and participate in the Arch Mission Foundation's Work?
The Arch Mission Foundation is a broad participatory project and we hope to be very inclusive. There are many ways you can help. Here are a few:
Donations and funding
The most important need we have presently is financial support. To achieve our mission we must purchase scientific equipment for producing Arch Libraries, and rent and build out our own lab for building them. We must also raise funding for Arch Library related scientific research and development at universities and small companies that provide exotic capabilities; for production and materials costs for creating Arch Libraries; and for launch costs, testing, legal and administrative help, consultants and operational staff, travel, accounting, computers, hardware, software, and other expenses related to running and growing the organization and its work. Our budget needs to be several million dollars per year to fully execute our roadmap on the timeline we would like to achieve.
Offer a Storage Location for the Arch Library
The Arch Mission Foundation is looking for locations on Earth and in space for storing copies of the Arch Library.
These locations should be able to preserve the Arch for long periods of time.
Examples would be deep underground or undersea locations, caves, mountains, aboard satellites or rockets, locations in space stations, interplanetary missions, and on lunar or Mars landers and rovers.
We will also distribute Arch Library copies to museums, schools, and other organizations in the future.
If you have or represent the rights to a large dataset of important human knowledge that should be on the Arch Libraries, we welcome your contribution of that data. The data will be stored on Arch Libraries in space that will not be accessed by anyone for thousands to millions of years, and will not be shared openly or given away freely without your permission. Any data on Arch Libraries that are to be used sooner can be encrypted such that you may control who can access it in the future. But we prefer open data sets or at least data sets that are not encrypted for now.
One exception: under fair use laws, it may be possible for the Arch Mission Foundation to archive data that you don’t own but have licensed -- such as your purchased music and video or books. If you would like to contribute that material for inclusion on the Arch, as your own backup in space, you may contribute it in the future.
Launch and Delivery Capabilities
If you happen to have a rocket company, or a satellite company, or a space agency, or a moon base, or any of the above - we would be happy to give you Archs to carry to your locations. We are working closely with many organizations that provide these capabilities and would like to work with all of them. We believe that it is best for everyone if the Archs are on as many missions, to as many locations, as possible.
Advisory Board Members
We are always looking for advisory board members who are at the tops of their fields, can make a real contribution to our work, and who bring influential communities and networks to the Arch Mission Foundation. While these are volunteer positions, they do provide access to a community of some of the most interesting and influential minds on the planet. And for this reason, and the fact that we aren’t funded sufficiently to manage a large group of advisors yet, we are also a bit picky about who we accept. If you fit the criteria here, and would like to apply, please contact us here with a brief note of introduction.
The Arch Mission Foundation is the work of an elite group of volunteers, and you can be one of them. We will be posting needs and volunteer opportunities here in the future.