Wikipedia offers free copies of all available content to interested users. These databases can be used for mirroring, personal use, informal backups, offline use or database queries (such as for Wikipedia:Maintenance). All text content is multi-licensed under the Creative Commons Attribution-ShareAlike 3.0 License (CC-BY-SA) and the GNU Free Documentation License (GFDL). Images and other files are available under different terms, as detailed on their description pages. For our advice about complying with these licenses, see Wikipedia:Copyrights.

Where do I get...

  • Dumps from any Wikimedia Foundation project: http://download.wikimedia.org/
  • English Wikipedia dumps in SQL and XML: http://download.wikimedia.org/enwiki/
    • pages-articles.xml.bz2 - Current revisions only, no talk or user pages. (This is probably the one you want. WARNING: 5 GB compressed, up to 20 times that size uncompressed.)
    • pages-current.xml.bz2 - Current revisions only, all pages
    • pages-full.xml.bz2/7z - Current revisions, all pages (includes talk and user pages)
    • pages-meta-history.xml.bz2 - All revisions, all pages (not available for enwiki; see below)
    • pages-meta-history.xml.7z - All revisions, all pages (not available for enwiki; see below)
    • abstract.xml.gz - page abstracts
    • all_titles_in_ns0.gz - Article titles only
    • SQL files for the pages and links are also available
    • Caution: Some dumps may be incomplete - pay attention to such warnings (e.g. "Dump complete, 1 item failed") near the dump file.
  • To download a subset of the database in XML format, such as a specific category or a list of articles see: Special:Export, usage of which is described at Help:Export.
  • Wiki front-end software: Wikipedia:MediaWiki.
  • Database backend software: You want to download MySQL.
  • Image dumps: See below.

In the http://download.wikimedia.org/ directory you will find the latest SQL dumps for all projects, not just English. Some examples follow; for others, select the appropriate two-letter language code and the appropriate project:

  • English Wikipedia dumps: http://download.wikimedia.org/enwiki/
  • Spanish Wikipedia dumps: http://download.wikimedia.org/eswiki/
  • French Wikipedia dumps: http://download.wikimedia.org/frwiki/
  • German Wikipedia dumps: http://download.wikimedia.org/dewiki/

Some other directories (e.g. simple, nostalgia) exist, with the same structure.

Latest complete dump of English Wikipedia

As of 17 January 2009, it seems that all snapshots of pages-meta-history.xml.7z hosted at http://download.wikipedia.org/enwiki/ are missing. The developers at the Wikimedia Foundation are working to address this issue (http://lists.wikimedia.org/pipermail/wikitech-l/2009-January/040841.html). There are other ways to obtain this file:

  • Internet Archive: http://www.archive.org/details/enwiki-20080103
  • From this unofficial mirror: http://jeffkubina.org/data/wikipedia/dumps/enwiki
  • Non-authoritative md5sum of the file is found at this page: http://www.jeffkubina.org/data/wikipedia/dumps/enwiki/200801/enwiki-20080103-md5sums.txt

Warning: the file enwiki-20080103-pages-meta-history.xml.7z decompresses to 2.8 terabytes. Before wasting Internet Archive's bandwidth, ask yourself: do you really have enough hard disk space to work on this file? Can't you use Wikipedia's API instead and work on a small random sample of the dataset?
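The disk space question can be answered before starting the download. The following is a minimal sketch, not an official tool: the function name has_room_for and the constant HISTORY_DUMP_BYTES are made up for illustration, and the 2.8 terabyte figure is read as decimal terabytes, which is an assumption.

```python
import shutil

# 2.8 terabytes from the warning above, assuming decimal units (10**12 bytes).
HISTORY_DUMP_BYTES = 28 * 10**11


def has_room_for(required_bytes, directory="."):
    """Return True if the file system holding `directory` has enough free space."""
    return shutil.disk_usage(directory).free >= required_bytes


if __name__ == "__main__":
    if has_room_for(HISTORY_DUMP_BYTES):
        print("Enough free space for the uncompressed history dump.")
    else:
        print("Not enough free space; consider a smaller dump or a sample.")
```

The check only looks at free space at the moment it runs, so leave generous headroom for temporary files and indexes.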

Images and uploaded files

Currently Wikipedia does not allow or provide facilities to download all images. As of 17 May 2007, Wikipedia disabled or neglected all viable bulk downloads of images, including torrent trackers. Therefore, there is no way to download image dumps other than scraping Wikipedia pages or using Wikix, which converts a database dump into a series of scripts to fetch the images.

Unlike most article text, images are not necessarily licensed under the GFDL and CC-BY-SA 3.0. They may be under one of many free licenses, in the public domain, believed to be fair use, or even copyright infringements (which should be deleted). In particular, use of fair use images outside the context of Wikipedia or similar works may be illegal. Images under most licenses require a credit, and possibly other attached copyright information. This information is included in image description pages, which are part of the text dumps available from download.wikimedia.org. In conclusion, download these images at your own risk (Legal).

Dealing with large files

You may run into problems downloading files of unusual size. Some older operating systems, file systems, and web clients have a hard limit of 2GB on file size. If you seem to be hitting this limit, try using wget version 1.10 or greater, cURL version 7.11.1-1 or greater, or a recent version of lynx (using -dump).

It is recommended that you check the MD5 sums (provided in a file in the download directory) to make sure your download was complete and accurate. You can check this by running the "md5sum" command on the files you downloaded. Given how large the files are, this may take some time to calculate. Due to the technical details of how files are stored, file sizes may be reported differently on different filesystems, and so are not necessarily reliable. Also, you may have experienced corruption during the download, though this is unlikely.
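Where the md5sum command is not available, the same check can be done with a short script. This is a minimal sketch using Python's standard hashlib module; the function names are illustrative, and the checksum file is assumed to use the usual md5sum layout of one "digest  filename" pair per line.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so memory use stays flat for multi-gigabyte dumps.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text):
    """Map file name -> expected digest from md5sum-style lines."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            # md5sum marks binary mode with a leading '*' on the file name.
            expected[parts[1].strip().lstrip("*")] = parts[0].lower()
    return expected
```

A download is good when md5_of_file(path) equals the entry for that file name returned by parse_md5sums.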

The file size limits for the various file systems are as follows:

  • FAT16 (MS-DOS version 6, Windows 3.1, and earlier) supports files up to 2GB.
  • FAT32/VFAT (Windows 95, 98, 98SE, and ME) supports files up to 4GB.
  • ext2 and ext3 filesystems can handle 16GB files and larger, depending on your block size. See http://www.suse.com/~aj/linux_lfs.html for more information.
  • HFS Plus (Mac OS X 10.2+) and XFS both support files up to 8 exabytes.
  • NTFS (Windows NT 3.51+, 2000, XP, Server 2003, and Vista) supports up to 16 exabytes.

Many standard programming libraries and functions may also cause problems when accessing large files. For example, on 32-bit systems, files opened with the standard C function fopen are limited to 2GB. This is because file offsets are stored as signed 32-bit integers, which limits them to 2^31 bytes (2GB).

Why not just retrieve data from wikipedia.org at runtime?

Suppose you are building a piece of software that at certain points displays information that came from Wikipedia. If you want your program to display the information in a different way than can be seen in the live version, you'll probably need the wikicode that is used to enter it, instead of the finished HTML.

Also, if you want to get all of the data, you'll probably want to transfer it in the most efficient way possible. The wikipedia.org servers need to do quite a bit of work to convert the wikicode into HTML. That's time-consuming both for you and for the wikipedia.org servers, so simply spidering all pages is not the way to go.

To access any article in XML, one at a time, access Special:Export/Title of the article.

Read more about this at Special:Export.
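As an illustration of the Special:Export/Title form, the sketch below builds the export address for a single article. The base URL for the English Wikipedia and the function name export_url are assumptions made for this example; other projects would use their own host name.

```python
from urllib.parse import quote

# Assumed base URL for the English Wikipedia; other projects use their own host.
EXPORT_BASE = "https://en.wikipedia.org/wiki/Special:Export/"


def export_url(title):
    """Build the Special:Export URL for one article title."""
    # MediaWiki titles use underscores in place of spaces; quote() then
    # percent-encodes any characters that are not safe in a URL path.
    return EXPORT_BASE + quote(title.replace(" ", "_"))


if __name__ == "__main__":
    print(export_url("Albert Einstein"))
```

This only constructs the address. Fetch one article at a time and keep a delay between requests.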

Please be aware that live mirrors of Wikipedia that are dynamically loaded from the Wikimedia servers are prohibited. Please see Wikipedia:Mirrors and forks.

Please do not use a web crawler

Please do not use a web crawler to download large numbers of articles. Aggressive crawling of the server can cause a dramatic slow-down of Wikipedia. Our robots.txt blocks many ill-behaved bots.

Note that the robots.txt currently has a commented out Crawl-delay:

    ## *at least* 1 second please. preferably more :D
    ## we're disabling this experimentally 11-09-2006
    #Crawl-delay: 1

Please be sure to use an intelligent non-zero delay regardless.
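One way to enforce such a delay in a small script is sketched below. The function fetch_politely and its parameters are hypothetical names for this example; the one-second default follows the "at least 1 second" comment in robots.txt, and the fetch argument stands for whatever single-page retrieval function you already use.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

The sleep parameter exists only so the pause can be replaced in tests. For anything beyond a handful of pages, use the database dumps instead.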

Doing SQL queries on the current database dump

You can do SQL queries on the current database dump (as a replacement for the disabled Special:Asksql page). For more information about this service, see de:Benutzer:Filzstift/wikisign.org (in German only).

Dealing with compressed files

Approximate file sizes are given for the compressed dumps; uncompressed they'll be significantly larger.

Some older archives are compressed with gzip, which uses the same DEFLATE compression method as PKZIP (the most common Windows format) and can be opened by most Windows archive tools. Newer archives are available in both bzip2 and 7zip compressed formats.

Windows users may not have a bzip2 decompressor on hand; a command-line Windows version of bzip2 is available for free under a BSD license.

The LGPL'd GUI file archiver, 7-Zip, is also able to open bz2 compressed files, and is available for free.

Mac OS X ships with the command-line bzip2 tool.

Please note that older versions of bzip2 may not be able to handle files larger than 2GB, so make sure you have the latest version if you experience any problems.
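Because the uncompressed dumps are so large, it is often easier to read the compressed file as a stream than to unpack it first. The sketch below uses Python's standard bz2 and xml.etree.ElementTree modules to list page titles. It assumes the dump consists of page elements that each contain a title element; the function name iter_titles is illustrative.

```python
import bz2
import xml.etree.ElementTree as ET


def iter_titles(dump_path):
    """Yield page titles from a bzip2-compressed XML dump without unpacking it to disk."""
    with bz2.open(dump_path, "rb") as stream:
        for _event, element in ET.iterparse(stream, events=("end",)):
            # Drop any XML namespace prefix such as "{http://...}title".
            tag = element.tag.rsplit("}", 1)[-1]
            if tag == "title":
                yield element.text
            elif tag == "page":
                # Discard the finished page's contents to keep memory use low.
                element.clear()
```

For example, `for title in iter_titles("pages-articles.xml.bz2"): print(title)` walks the whole dump while holding only one page in memory at a time.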

Database schema

SQL schema

The SQL file used to initialize a MediaWiki database can be found here.

XML schema

The XML schema for each
