On January 1, 2023, the Internet turned 40. In the forty years since its inception, users have put an unfathomable amount of data online. In 2022 alone, people added about 97 zettabytes (97 trillion gigabytes) of new data, and that number increases every year. Many consider the Internet a sort of modern-day Library of Alexandria, where you can find answers to (almost) any question. However, the links to many older pages no longer work. These dead pages have succumbed to a phenomenon known as “link rot.”
What causes link rot, and why is it an important issue?
According to The Verge, about 72% of links generated in 1998 have succumbed to link rot. A URL (uniform resource locator) can stop working and display the dreaded “error 404” message for several reasons. For example, a web page’s owner could change hosts, the domain name could expire, or the site could crash altogether.
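For readers who want to test a link themselves, here is a minimal Python sketch of what a dead-link check might look like. The function names and labels (`classify_status`, `check_link`, "rotten") are made up for illustration, and treating only codes 404 and 410 as rot is an assumption, not a standard definition.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def classify_status(status_code):
    """Map an HTTP status code to a rough, illustrative link-health label."""
    if 200 <= status_code < 300:
        return "alive"
    if status_code in (404, 410):
        # 404 means "not found"; 410 means "gone for good".
        return "rotten"
    # Anything else (server errors, blocked requests) needs a closer look.
    return "unclear"


def check_link(url, timeout=10.0):
    """Ask the server for the page's headers only and classify the answer."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except HTTPError as error:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return classify_status(error.code)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts,
        # e.g. when a domain name has expired.
        return "unreachable"
```

Calling `check_link` on a URL would return one of the four labels. A check like this only catches links that fail outright; a page that still loads but now shows different content would pass as "alive."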
So, why is link rot a problem? In 2023, our lives revolve around the Internet. According to Pew Research Center, 85% of Americans say they go online daily, and nearly a third say they use the web constantly. Since the dawn of the social media age especially, we have used the Internet to connect with friends and family.
In the last decade-plus, we’ve saved many of our fondest memories on Facebook’s (or some other social media site’s) servers. It’ll likely be some time before our old profiles go the way of the dinosaur. However, it will almost inevitably happen (especially if you no longer use the platform).
Link rot also wreaks havoc on journalists, researchers, and academics trying to cite old material. For example, according to a Harvard study of law journals, over 70% of the links examined no longer lead to the originally cited sources. About half of the links in the United States Supreme Court opinions studied were rotten. And about three-quarters of the links researchers examined led to content different from what was cited. Additionally, a study by Nanyang Technological University in Singapore showed the issue affects “.edu” links the most, at 36%.
How can we save our data?
Several organizations and non-profits are working on archiving old data on the web. The Internet Archive is a digital library founded by computer engineer Brewster Kahle in 1996. The public can freely upload data to its collection and download data from it. It also saves old, defunct web pages and allows anyone to access them through its web archive tool, the Wayback Machine. As of 2023, the Wayback Machine holds 811 billion archived web pages.
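The Internet Archive also offers a public "availability" lookup that reports whether the Wayback Machine holds a snapshot of a given page. The Python sketch below builds such a lookup address and reads the reply. The helper names are hypothetical, and the reply layout (`archived_snapshots`, `closest`, `available`, `url`) reflects an assumed reading of the service's documented JSON format, so check it against the current documentation before relying on it.

```python
import json
from urllib.parse import urlencode

# Public lookup endpoint of the Wayback Machine availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the lookup address for a page, optionally near a given date.

    `timestamp` is a string of digits such as "19980101" (YYYYMMDD).
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived copy's address from a JSON reply, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

Fetching the address from `availability_query` (in a browser or with `urllib.request.urlopen`) and passing the reply text to `closest_snapshot` would yield a link to the nearest saved copy, or `None` if the page was never archived.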
And in the academic realm, where link rot is a more pressing matter, Perma.cc is the go-to archival service. The Harvard Law School Library Innovation Lab founded the academic archive in 2013 in direct response to the issue. In 2016, the Institute of Museum and Library Services awarded the lab a $700,000 grant to expand Perma.cc. Its crucial difference from the Wayback Machine is that it doesn’t use web crawlers to scour the Internet. Instead, users submit the pages they want preserved.
On an individual level, your best bet for saving your digital memories is to store them off the Internet. Social media platforms are increasingly adopting policies that delete inactive profiles.