#jdupe — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #jdupe, aggregated by home.social.
-
On Local Photo Management and the Command Line
Reading Time: 5 minutes
Picasa and iPhoto
Picasa and iPhoto were great apps. Both were free. Both allowed you to manage your photos locally, and both allowed you to take pictures with a camera or your phone and sync them when you got home. Over time our phones synced to these apps via the cloud.
We lost the habit of getting home and ingesting photos because everything was done automatically. We took pictures and they appeared in Picasa and iCloud and we didn't think about it too much.
This was the gateway habit that led to some of us having cloud-first photo libraries rather than local ones. One of the factors that led to this is that Picasa synced automatically to the cloud, as did iPhoto.
This was excellent because for years our photographs were stored both locally and in the cloud. That is, until our libraries grew too big for laptop hard drives. At this point we had to choose: get a laptop with a larger hard drive, delete photos, or spend more on iCloud. At 3 CHF per month, spending more on iCloud was an easy choice.
It was later, when the library grew beyond 200 GB, that Google One became an interesting proposition. You could get two terabytes for 100 CHF per year. That headroom is a huge luxury, for as long as you can back up your photos from the cloud to a local volume.
The issue is that you can't, and you couldn't. It's only recently that I really managed to export my photos from Google Photos and Flickr, and only after two weeks of experimenting and learning.
The Issue
Cloud services are great for syncing all your photos and videos, as well as all the photos you get from WhatsApp, screenshots and more. They're not good when it comes to reorganising files.
If friends and family share photos via WhatsApp or Signal, or you download videos from TikTok or Flickr, they're all mixed in with your own photos. This creates noise twice: first in iCloud and Google Photos, and then again in your WhatsApp history.
When people share photos and videos, WhatsApp downloads them to its own backup, as well as to your photo gallery if you allow it to, which I recommend for one reason. WhatsApp has a nasty habit of taking 100 MB or more per chat. This noise comes from photos, videos, PDFs, GIFs and more. You might have a copy in Google Photos, in Apple Photos and potentially in Immich, Photoprism and other photo clouds.
Hard to tidy
Google Photos, Apple Photos, Immich and Photoprism are great at automatic cataloguing but not at helping you tidy up the mess they help you create. For a start, Immich and Apple Photos make a tremendous mess of your photo files and hierarchy if you give them free rein. You go from a neatly organised hierarchy to a machine-friendly mess that you need to clean up if you choose to move away from them.
With Apple Photos and Google Photos I find it excruciatingly hard to "spring clean" when storage gets low. With iPhoto I noticed that files are almost immediately backed up to iCloud, so that if you migrate to Immich and Photoprism you download the entire library every time Immich or Photoprism crashes and needs to be repopulated. This often takes a day of keeping the phone's screen on. That's why having a local library is key, and why kDrive is a great tool and a better solution.
The Local Advantage
With kDrive, as with Google Photos, Photoprism, Immich, iPhoto and others, you save your photos to a cloud. Unlike with them, kDrive gives you a hierarchical folder structure that you can download and work on with command line tools for batch operations, or visually for manual tidying tasks.
Exiftool
If your files are fresh from Google Takeout, the Immich folder structure or another source, you can use a command line tool such as exiftool to reorganise everything chronologically, and more.
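To make the idea of a chronological layout concrete, here is a minimal Python sketch of the year/month/day structure described above. It is not what exiftool does internally: as a loudly stated assumption it uses each file's modification time as a stand-in for the capture date, where exiftool would read the date from the photo's metadata. The function names and paths are hypothetical, and it defaults to a dry run.

```python
import shutil
from datetime import datetime
from pathlib import Path


def dated_destination(library: Path, photo: Path, taken: datetime) -> Path:
    """Return library/YYYY/MM/DD/<filename> for a photo taken at `taken`."""
    return library / f"{taken:%Y}" / f"{taken:%m}" / f"{taken:%d}" / photo.name


def sort_chronologically(source: Path, library: Path, dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Plan, and optionally perform, moves from `source` into a dated tree.

    Assumption: the modification time stands in for the capture date.
    """
    plan = []
    # Build the full file list first so moving files does not disturb the walk.
    for photo in sorted(p for p in source.rglob("*") if p.is_file()):
        taken = datetime.fromtimestamp(photo.stat().st_mtime)
        target = dated_destination(library, photo, taken)
        plan.append((photo, target))
        # Never overwrite: a file that already exists at the target is left alone.
        if not dry_run and not target.exists():
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(photo), str(target))
    return plan
```

Calling sort_chronologically(source, library) returns the planned moves without touching anything, which can be reviewed before running it again with dry_run=False.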
Jdupe
With Jdupe you can look for duplicates automatically. With Immich I noticed that I had 27,000 duplicates to sort through. In some cases they're triplicates and in other cases the duplicates are thumbnail duplicates. To do this sorting, manually, with the Immich tool would take weeks or months. With Jdupe it takes a few seconds to a few hours depending on how many duplicates there are.
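For readers curious about what a duplicate finder does, the sketch below shows the general technique in Python: group files by size, then hash only the files that share a size. This is an illustration of the idea rather than the tool's actual implementation, and the function names are my own. It only catches byte-identical copies, so a resized thumbnail of a photo would not be reported as a duplicate of the original.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of byte-identical files found under `root`."""
    by_size = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for same_size in by_size.values():
        # A file with a unique size cannot have a duplicate, so skip hashing it.
        if len(same_size) < 2:
            continue
        for path in same_size:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

The function only reports groups. Deciding which copy to keep is left to the reader, which is the safer default when experimenting on a photo library.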
rsync
With rsync you can transfer files between volumes with ease and convenience. The computer does the work in the background, backing up to a local drive, and a remote drive.
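As a sketch of how such a backup could be scripted, the Python snippet below builds an rsync command and runs it. The paths are hypothetical placeholders. The --archive, --verbose and --dry-run options are standard rsync flags, and the sketch defaults to a dry run so that nothing is copied until the output has been checked.

```python
import subprocess
from pathlib import Path


def rsync_command(source: Path, destination: Path, dry_run: bool = True) -> list[str]:
    """Build an rsync invocation that mirrors `source` into `destination`."""
    command = ["rsync", "--archive", "--verbose"]
    if dry_run:
        # Report what would be transferred without copying anything.
        command.append("--dry-run")
    # The trailing slash on the source copies its contents, not the folder itself.
    command += [f"{source}/", f"{destination}/"]
    return command


def run_backup(source: Path, destination: Path, dry_run: bool = True) -> None:
    """Run rsync and raise an error if it exits with a failure code."""
    subprocess.run(rsync_command(source, destination, dry_run), check=True)


if __name__ == "__main__":
    # Hypothetical paths: replace with your own library and backup volume.
    print(" ".join(rsync_command(Path("/Volumes/Photos/library"), Path("/Volumes/Backup/library"))))
```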
The Tailscale Caveat
If you're syncing gigabytes of files, use the local IP address rather than Tailscale, because I suspect Tailscale will throttle you after a certain amount of data transfer. It's also a lot faster to do things locally. If you do need a remote copy, sync it locally first, and then move the drive to the remote location.
Visual Sorting and Find
While waiting for rsync to complete certain jobs I went through libraries manually and noticed patterns. I asked Gemini to create a command to help move webp, png and mp4 files with one pattern from my photo library to a secondary photo library that I can sort through at another time. In one instance that removed 130 gigabytes of noise.
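The command Gemini produced is not reproduced here, so the following Python sketch is only an approximation of the idea: move webp, png and mp4 files whose names match a pattern into a secondary library, preserving their relative paths. The function name, the default pattern and the assumption that the secondary library sits outside the main one are all mine.

```python
import shutil
from pathlib import Path

# File types treated as noise, taken from the paragraph above.
NOISE_SUFFIXES = {".webp", ".png", ".mp4"}


def move_noise(library: Path, holding: Path, name_pattern: str = "*", dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Plan, and optionally perform, moves of matching files into `holding`.

    Assumption: `holding` is not located inside `library`.
    """
    moves = []
    for path in sorted(library.rglob(name_pattern)):
        if not path.is_file() or path.suffix.lower() not in NOISE_SUFFIXES:
            continue
        # Keep the same relative path so files can be traced back later.
        target = holding / path.relative_to(library)
        moves.append((path, target))
        if not dry_run:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(target))
    return moves
```

Running it once with the default dry_run=True and reading the returned list is a cheap way to confirm that the pattern only catches the files you expect.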
The Motivating Push
I abandoned iCloud as the single source of truth for my photos when my library grew beyond 200 GB, and shifted towards Google Photos. With two terabytes of storage I enjoyed the luxurious feeling. I enjoyed it until I saw that I could get 6 TB for 67 CHF from Infomaniak, and that's when I spent a long time migrating off Google Photos.
They make it very hard because you can't just download a chronological list of folders and files as you can with kDrive.
Almost a Terabyte to Sort
My Apple, Google and Flickr libraries came to almost a terabyte of data, most of it duplicates. Sorting through it by hand would take months. With the tools listed above, once I had a workflow prepared, it took days. Now my library is 370-390 GB.
And Finally
27,000 Duplicates in Immich
I tried ingesting from mobile phones and an old Immich library, but in so doing I ended up with 27,000 duplicate pairs that I would have to sort through by hand. This task would take months. By removing all the duplicates before ingesting into Immich I will save weeks of tedious work.
JDupe and Peace of Mind
My iCloud library hasn't been the single source of truth for years, due to the 200 GB limit. For a while Google Photos was, until I downgraded the plan, and then it became a former single source of truth. Now I hope that Flickr will have filled many of the gaps. On a drive or two I have old iPhoto libraries.
If required I can open the package, extract the originals and run exiftool to create a chronological library. I can repeat this until all my libraries are consolidated, then import them into my main photo library and ingest them into Immich and Photoprism.
Conclusion
With command line tools you can consolidate photo libraries from multiple sources into a single source of truth, and move on. By maintaining this single source of truth, and backing it up to kDrive, Google Drive or even iCloud, you ensure that it is complete and easy for Immich, Photoprism or some other tool to ingest.
#cloud #exiftools #GoogleDrive #iphoto #jdupe #Picasa -
Migrating to kDrive from Flickr, Apple and Google Photo Clouds
Reading Time: 4 minutes
As I write this my consolidated photo album is being uploaded to kDrive, to serve as an offsite backup, but the journey to this point took about two weeks, due in part to experimentation and learning to use various tools.
Tools I used
- rsync
- Google Takeout
- Flickr Export
- jdupe
- Gemini
- Euria
- Le Chat, by Mistral
Work Flow
The first step is to request your data from Google Photos via the Google Takeout tool, from Flickr via the Flickr Export tool, and to download all your photos locally from Apple Photos before disconnecting the local library from iCloud. Disconnecting Photos from iCloud gives you 30 days to realise you made a terrible mistake and fix it.
Export and organise
The next step is to unzip the Google Takeout files in one place, and the Flickr export in another place. You want to keep the tree structure created by the zips for the next part.
Exiftool
Exiftool is a command line tool. Google Takeout and Flickr Export may detach metadata from your photos and store it in JSON files. Exiftool writes that metadata back into your photo files. If you ask Gemini or another AI assistant for help, it will provide the command you need. Request a dry run, and have the dry run write to a text file so you can double check that it does what you expect.
Keep the zip files as they are. If you make a mistake it's good to have them on hand. Downloading 50 GB files from Google Takeout takes time.
With Flickr it's even more critical because Flickr generates 2 GB files. I created a script to automatically download my 168 files.
Once you are happy that exiftool is behaving as expected you can run the command for real. Both of these steps take time so let them run in the background.
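To illustrate the dry-run-to-a-text-file idea, here is a minimal Python sketch that pairs each JSON sidecar with its media file and writes the planned change to a report, without modifying any photo. It rests on assumptions that should be checked against your own export: that a sidecar is named after its media file with .json appended (for example IMG_1.jpg.json), and that the capture time is stored under photoTakenTime.timestamp as seconds since 1970 in UTC. The function names are hypothetical, and exiftool remains the tool that does the real writing.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def planned_updates(export_root: Path) -> list[str]:
    """Describe, one line per sidecar, what a metadata merge would do."""
    lines = []
    for sidecar in sorted(export_root.rglob("*.json")):
        # Assumed naming: IMG_1.jpg.json sits next to IMG_1.jpg.
        media = sidecar.with_suffix("")
        if not media.is_file():
            lines.append(f"SKIP {sidecar}: no matching media file")
            continue
        data = json.loads(sidecar.read_text(encoding="utf-8"))
        # Assumed key layout: {"photoTakenTime": {"timestamp": "<seconds>"}}.
        taken_block = data.get("photoTakenTime", {}) if isinstance(data, dict) else {}
        timestamp = taken_block.get("timestamp")
        if timestamp is None:
            lines.append(f"SKIP {sidecar}: no photoTakenTime.timestamp")
            continue
        taken = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
        lines.append(f"SET {media}: DateTimeOriginal={taken:%Y:%m:%d %H:%M:%S}")
    return lines


def write_dry_run(export_root: Path, report: Path) -> int:
    """Write the plan to a text file for review and return the line count."""
    lines = planned_updates(export_root)
    report.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

Reading the SKIP lines in the report is a quick way to spot directories that contain only JSON files, or sidecars that follow a different naming scheme.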
Google Takeout
Google Takeout generates albums in three key ways: by individual names if you used face recognition, by event name if you created an album, and by year, automatically. You will have two to three copies of some photos. In some directories you will only find JSON files.
When exiftool has run you can back up or delete the JSON files. If you have the zip files, then you're safe.
Flickr
When I expanded the Flickr zips it created a monolithic directory with all the photos. I ran exiftool to marry the JSON data with the photos.
Apple Photos
If you want to extract photos from Apple Photos, the quickest solution is to right click the library, choose Show Package Contents, navigate to originals, and copy the photos to another directory. You will need to use exiftool to create a directory where they are sorted by year, month and day, and then you can run jdupe and add them to your main library.
Looking for Duplicates and Creating Chronological Libraries
With the data added by Exiftool we can now organise the photos chronologically. The issue is that we have event photos in albums, and the same event photos in the year folder. That's where jdupe comes in. It allows us to automatically compare photos within a directory before removing the duplicate copies.
Once this is done we can organise all the photos chronologically. This makes comparing photos much easier. It also adds a human accessible way of organising photos by year, month and day.
We repeat this step for Google Takeout and Flickr so that we end up with two clean chronological libraries.
The next step is to run jdupe again. This time we're comparing Flickr to Google Photos. In an ideal world we have a perfect mirror, with both libraries being complete. In reality we might have interrupted payment to Flickr or Google Photos, so we have gaps. That's why we look for duplicates before merging unique photos into our main photo library.
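The cross-library comparison can be sketched in Python as a set difference on content hashes: hash everything in the first library, then list the files in the second whose content does not appear in the first. This is an illustration of the idea, not the tool's own method, and the function names are hypothetical. For brevity it reads each file fully into memory, which is an assumption that suits photos better than long videos.

```python
import hashlib
from pathlib import Path


def content_hashes(library: Path) -> dict[str, Path]:
    """Map each distinct file content in `library` to the first path holding it."""
    hashes = {}
    for path in sorted(library.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # setdefault keeps the first path seen for a given content.
            hashes.setdefault(digest, path)
    return hashes


def only_in_second(first: Path, second: Path) -> list[Path]:
    """Return files in `second` whose content does not exist anywhere in `first`."""
    known = content_hashes(first)
    return [path for digest, path in content_hashes(second).items() if digest not in known]
```

The files returned by only_in_second(google_library, flickr_library) are the ones that fill gaps, and therefore the ones worth merging into the main library.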
Tools such as rsync will help you merge the two libraries into the main library, as well as back up the clean library to an external hard drive or to another device.
The kDrive migration
If you have not already done so, install the kDrive app and log in. Open the app, navigate to your library's folder and tell kDrive to sync the folder. It will then start copying the data to your cloud. Now you wait for it to be done.
Cleanup and Looking Forward
Once the main library is synced to kDrive I can delete two photos folders from kDrive and my local machine. I can tell kDrive on my phone to sync to the new library folder on kDrive.
That Synching Feeling
For now:
- Photosync adds photos to photoprism
- immich app adds photos to Immich
- kDrive app uploads to kDrive storage
Photoprism and Immich Watching
Both Photoprism and Immich allow you to watch an import folder (Photoprism) or an external library (Immich). If you set the main library as a watch folder then new photos uploaded to kDrive will be added to the main library, and Photoprism and Immich will add them to their own libraries. Unselect the "move" option to keep the chronological library intact.
And Finally
With jdupe, exiftool and rsync you can whittle three photo libraries down to just one. You can then tell kDrive desktop to watch and sync that folder. You can use rsync to mirror the library to two or three other drives and filesystems. I have APFS, APFS (case sensitive) and ext4. I also have an offsite backup via kDrive.
#Apple #exiftool #Google #infomaniak #jdupe #kdrive #photos #rsync #takeout -