[Xmldatadumps-l] New mirror of 'other' datasets
Ariel Glenn WMF
2016-05-04 12:33:31 UTC
I'm happy to announce a new mirror for datasets other than the XML dumps.
This mirror comes to us courtesy of the Center for Research Computing,
University of Notre Dame, and covers everything "other" [1] which includes
such goodies as Wikidata entity dumps, pageview counts, titles of all files
on each wiki (daily), titles of all articles of each wiki (daily), and the
so-called "adds-changes" dumps, among other things. You can access it at
http://wikimedia.crc.nd.edu/other/ so please do!

Ariel

[1] https://dumps.wikimedia.org/other/
Federico Leva (Nemo)
2016-05-15 16:21:10 UTC
You can access it at http://wikimedia.crc.nd.edu/other/ so please do!
Great news, especially because it's ten times faster than
dumps.wikimedia.org! Finally, every time I need a dataset to quickly
verify a sudden idea I have, the download becomes a matter of minutes
rather than hours.

Nemo
Ariel Glenn WMF
2016-06-17 11:21:39 UTC
Dear all,

The server hosting this service has been moved to a different network, and
as such, it is now "only accessible/routable from select (still many)
members of Internet2 (U.S. universities), ESnet (U.S. national labs), and
Geant in Europe. This restricted list of places is currently limited, but
is continually growing", as the email from our contact at that mirror says.
For folks from specific institutions that suddenly no longer have access, I
can forward institution names along and hope that helps.

Ariel
Federico Leva (Nemo)
2016-06-17 12:59:11 UTC
Post by Ariel Glenn WMF
For folks from specific institutions that suddenly no longer have
access, I can forward institution names along and hope that helps.
It would be nice to whitelist the wmflabs.org servers, which would
benefit from a faster server to download this stuff from.

Nemo
Federico Leva (Nemo)
2016-09-27 09:34:23 UTC
Post by Federico Leva (Nemo)
Post by Ariel Glenn WMF
For folks from specific institutions that suddenly no longer have
access, I can forward institution names along and hope that helps.
It would be nice to whitelist the wmflabs.org servers, which would
benefit from a faster server to download this stuff from.
Did this prove impossible? I need mediacounts data on a Labs server now,
and it would take days to download from dumps.wikimedia.org.

Nemo
Ariel Glenn WMF
2016-09-27 09:47:57 UTC
I got nothing back from my email so I assume that means it's not happening.

http://dumps.wikimedia.your.org/other/mediacounts/daily/2016/ There are
mediacounts here; is the download speed acceptable?

Ariel
_______________________________________________
Xmldatadumps-l mailing list
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
Federico Leva (Nemo)
2016-09-27 10:13:25 UTC
Ok.
Post by Ariel Glenn WMF
http://dumps.wikimedia.your.org/other/mediacounts/daily/2016/ There
are mediacounts here; is the download speed acceptable?
Oh yes, that's around 50 MiB/s. I did not see this directory linked from
their main page so I thought they had removed it; I'll add the link from
https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps

Nemo
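
For anyone scripting downloads of the "other" datasets, here is a minimal
sketch of how the hosts mentioned in this thread could be tried in order for
the same relative path. The helper name candidate_urls and the ordering of
the list are assumptions made for illustration; the thread does not say that
every mirror carries every directory, and the crc.nd.edu host is only
reachable from some networks, so a caller would still need to check each URL
before relying on it.

```python
# Base URLs as they appear in this thread. Whether a given mirror actually
# serves a given path is NOT guaranteed; treat these as candidates only.
CANONICAL = "https://dumps.wikimedia.org/other/"
MIRRORS = [
    "http://dumps.wikimedia.your.org/other/",
    "http://wikimedia.crc.nd.edu/other/",  # restricted to select networks
]


def candidate_urls(relative_path, bases=None):
    """Return the URLs to try, mirrors first and the canonical host last.

    relative_path is the part after ".../other/", for example
    "mediacounts/daily/2016/". This helper is hypothetical and does no
    network access; it only builds the strings.
    """
    if bases is None:
        bases = MIRRORS + [CANONICAL]
    rel = relative_path.lstrip("/")
    return [base.rstrip("/") + "/" + rel for base in bases]


if __name__ == "__main__":
    for url in candidate_urls("mediacounts/daily/2016/"):
        print(url)
```

A download script could walk this list and fall back to the next entry when
a host is unreachable or the directory is missing.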
Ariel Glenn WMF
2016-09-27 10:39:53 UTC
Thanks, that's great.

Ariel