#googledatatakeout — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #googledatatakeout, aggregated by home.social.
-
Here's your periodic reminder to #backup your #SocialMedia profiles by using their relevant #DataLiberation options. :)
For #Mastodon that is usually at https://your.instance.example/settings/export
#Twitter has an export option at https://twitter.com/settings/download_your_data
#Google's services are just about all consolidated at their #GoogleDataTakeout at https://takeout.google.com/
#Instagram's available at https://www.instagram.com/download/request/ -
In an attempt to make it easier to go through my old #GooglePlus posts to see which ones I want to repost on my #blog as #ArchivedContent, I've picked up #coding on #PlexodusTools again, and started adding #DataMassaging scripts for sorting and filtering content.
One of the first steps towards this is discovering which topics I posted about the most, so here are the 50 most common hashtags from my #GPlus archive as taken from #GoogleDataTakeout:
https://gist.github.com/FiXato/464fb97df38a646955621eef74449fda -
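For the hashtag tally described in the post above, a rough sketch of how such a count could be done is shown here. It is not the #PlexodusTools implementation: the function names are made up for illustration, it assumes the archive is a folder of one-JSON-file-per-post, and it scans every string value rather than relying on any particular field name. The regex is only a heuristic that tries to skip URL fragments and HTML entities.

```python
import json
import re
from collections import Counter
from pathlib import Path

# Heuristic: a '#' that is not preceded by a word character, '/' or '&',
# so URL fragments (…/123#author) and entities (&#39;) are not counted.
HASHTAG_RE = re.compile(r"(?<![\w/&])#(\w+)")


def iter_strings(node):
    """Yield every string value found anywhere in a decoded JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def count_hashtags(archive_dir, top=50):
    """Return the `top` most common hashtags (lower-cased) as (tag, count) pairs.

    Assumption: `archive_dir` holds the extracted Takeout JSON files.
    """
    counts = Counter()
    for path in Path(archive_dir).rglob("*.json"):
        with path.open(encoding="utf-8") as handle:
            post = json.load(handle)
        for text in iter_strings(post):
            counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts.most_common(top)
```

Calling `count_hashtags("path/to/extracted/posts")` would return a list of `(hashtag, count)` pairs; the path is a placeholder for wherever the archive was unpacked.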
The more I look at this old #GooglePlus #GoogleDataTakeout archive, the more I spot bugs/issues I missed during the #GooglePlusExodus period.
Of course nothing can be done about it anymore now, but it might be a warning for #DataLiberation of data from other #Google services.
Take for instance this #JSON file from my #GPlus Stream #GoogleTakeout archive: https://gist.github.com/FiXato/6edb8af605d0f6c0febe5fc94aeab087
And compare it to the related archived post on #WayBackMachine: http://web.archive.org/web/20190308043152/https://plus.google.com/112064652966583500522#author:i34 -
Looking at the source JSON from my #GooglePlus #GoogleDataTakeout archive, there's no firm indicator that this was a repost though... So, great, now I need to second-guess the origin of all my archived posts...
The only hint is a reference to Sean's user account in the postACL, but that can also happen with a regular plus-mention.
Hopefully my own scraped archive will be more useful...
-
Ugh... downloading multiple large archives from #GoogleDataTakeout is /TEDIOUS/... I don't dare download more than 2 at a time out of fear all of them will abort around 80-90%, but that also means that every time I queue up two more, I need to enter my password again...
I wish I could just use pubkey authentication and download these via rsync over SSH...
-
Oh how useful... the #GoogleDataTakeout JSON files seem to contain references to https://video-downloads.googleusercontent.com/ files that give an HTTP 400 error, possibly because those URLs tend to expire after a short period?
-
@jbond have you also noticed that a lot of the .localFilePath items for .album .media[] items in Google+ Stream/Posts seem to be pointing to files that are not actually there?
#GoogleDataTakeout #GooglePlus
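One way to check how widespread the missing-file problem is would be a small script along these lines. It is only a sketch under assumptions: the `.album`, `.media[]` and `.localFilePath` keys come from the post above, but the function name is hypothetical, and resolving relative paths against the folder of each JSON file is a guess about how the archive is laid out.

```python
import json
from pathlib import Path


def missing_media(posts_dir):
    """Return (json_file, localFilePath) pairs whose referenced file is absent."""
    missing = []
    for json_path in sorted(Path(posts_dir).rglob("*.json")):
        with json_path.open(encoding="utf-8") as handle:
            post = json.load(handle)
        if not isinstance(post, dict):
            continue  # not shaped like a single post; skip it
        album = post.get("album") or {}
        for media in album.get("media") or []:
            if not isinstance(media, dict):
                continue
            local = media.get("localFilePath")
            if not local:
                continue
            # Assumption: relative paths are relative to the JSON file's folder.
            candidate = Path(local)
            if not candidate.is_absolute():
                candidate = json_path.parent / candidate
            if not candidate.exists():
                missing.append((json_path, local))
    return missing
```

Looping over `missing_media("path/to/Posts")` and printing each pair would give a list of posts whose media never made it into the download; the directory name is a placeholder.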