This is a searchable archive of "old" 4Chan posts. Currently it is hosting (part of) the "Archive Ten Billion" dump of 2005-2008 era 4Chan threads. Among these are the very earliest posts for many boards: /fa/, /fit/, /hc/, /jp/, /n/ (the Transportation reset), /r9k/, /s/, /sp/, /tg/, /toy/, /trv/, /x/. It includes posts from almost all 4Chan boards of the era.*
*Missing: /f/ (flash), /j/ (the sekrit club), /con/ (temporary board for conferences) and maybe /yg/ (yogurt?).
The archive is missing a large chunk of the original archiver's coverage of /b/ and /p/. See below for more.
For a different way to browse some of the contents of this archive, check out 4museum (unaffiliated, est. 2019).
This archive launched in Oct. 2021, shortly after 4Chan's 18th anniversary.
About the 'Archive Ten Billion' data
- This archive was mostly generated from the XML version of the dump, which is missing /b/ and /p/ but seems to be otherwise whole. The /b/ and /p/ posts collected here were retrieved from the MySQL version of the dump. The /b/ XML .tar.gz was missing from day 1 but was referenced in the sqlite manifest.
- The HTML version of the dump is missing post numbers and timestamps. The /b/ HTML dump is also corrupt, although a large proportion of the threads can apparently be retrieved, making it a good resource for /b/ threads not hosted here.
- The upload also included a (corrupt) copy of the raw MySQL database files, which contain second-accurate timestamps for /b/. The corrupt .tar.gz is completely missing several required files (or the tar's file table is corrupted). I was able to retrieve about 36M /b/ posts (2.8M /b/ threads), as well as some of /p/, from the parts of the .MYD files before and after the point of visible gzip corruption. There are still about 3M /b/ threads missing.
- There are about 114 million posts in the collection across about 7.5 million threads, covering 44 boards. See the stats page.
- The earliest posts in the archive date to 2005 (/r/). The data for most other boards dates back to 2006, and ends some time in 2008. The newest posts date to 2008-12-04. Few of these posts, as of Oct. 2021, are replicated in other sources. Exceptions include /jp/, which was archived by Fuuka archivers from day 1, and /a/ through most of 2008.
- The original archive was text-only. Those posts that were marked as having images will be displayed here with a placeholder image.
- The dump was uploaded by Jason Scott of Internet Archive in 2018 (first teased way back in 2009) and attributed to an anonymous donor.
- All image data is missing in the source, apart from a flag for whether a post had an image. The missing data includes original filenames, media file sizes, uploaded thumbnail/full-size media filenames, and media MD5 hashes.
- No data regarding deletion status or time was preserved in the source.
- Capcodes are missing from the source.
- Some ban messages seem to be missing from the source. Others are preserved.
- Posts were only timestamped to the minute in the source. This means that posts ordered by timestamp are often disordered by number, since many will share the same timestamp.
- The source included a little unintentional HTML mess which was stripped, e.g. some 'Comment too long, click here' tags.
- The comment body source is pure HTML. The HTML tags needed to be stripped to fit the native Asagi format for this archive.
- Emails have been anonymized, just to be safe.
- Oekaki post data (drawing duration etc.) has been stripped.
- /n/ switched from 'News' to 'Transportation' on 2008-02-19, so posts/threads from both of these iterations of /n/ are in the archive. Before switching to 'News' in 2006, /n/ was 'Nature' in 2005 and 'Trains' in 2004 (neither covered in this archive). The post count was reset when /n/ transitioned to Transportation. The archiver may have overwritten some posts from News with posts from Transportation as a result of the reset.
- /sp/ was killed in 2006 and was brought back with a post number reset on the same day as /n/ became Transportation, 2008-02-19. The archiver may have overwritten some posts from the first iteration of the board with the second.
- /b/'s fortune text is missing its color (the regex missed the pound sign preceding the color ID).
- The default OP thumbnail is low-res for some boards (reply thumbnail file overwrote OP thumbnail file with identical timestamp).
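For anyone working with the data directly, the minute-level timestamps noted above mean that sorting by timestamp alone can shuffle posts within the same minute. A minimal sketch of one workaround is to break ties by post number, assuming post numbers increase monotonically within a single board iteration (this does not hold across the /n/ and /sp/ resets). The `Post` type and its field names are hypothetical, not the archive's actual schema.

```python
from typing import NamedTuple


class Post(NamedTuple):
    # Hypothetical field names, for illustration only.
    num: int        # post number within the board
    timestamp: int  # Unix time, truncated to the minute


def thread_order(posts):
    """Sort by timestamp, breaking same-minute ties by post number."""
    return sorted(posts, key=lambda p: (p.timestamp, p.num))


# Three replies share one minute; the OP is a minute earlier.
posts = [
    Post(num=103, timestamp=1200000060),
    Post(num=101, timestamp=1200000060),
    Post(num=102, timestamp=1200000060),
    Post(num=100, timestamp=1200000000),
]

print([p.num for p in thread_order(posts)])  # [100, 101, 102, 103]
```

Within a single thread, sorting by post number alone would give the same result. The compound key is mainly useful when the timestamp ordering needs to be kept as the primary sort.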
Reports / Contact
For personal information takedown requests, use the report function or email admin[@]sage.moe.