AernaLingus

joined 2 years ago
[–] [email protected] 6 points 1 week ago

The original French quote appears to be from here. Stories link to another tweet (since made private) as the source of the translation, which quotes the first tweet, but it only differs from the embedded Google Translate result by a single word ("tasty" vs. "satisfying"). Here's a video of the press conference with more context and a similar translation of that quote.

[–] [email protected] 5 points 2 weeks ago

Whoa, that looks pretty sick. Definitely will give it a shot next time the need arises!

[–] [email protected] 6 points 2 weeks ago

> David P. Goldman is deputy editor of Asia Times and a fellow of the Claremont Institute’s Center for the American Way of Life.

What a fash-coded name for a think tank. Might as well call it the Center for Securing the Existence of Our People and a Future for White Children.

[–] [email protected] 4 points 2 weeks ago

In text form:

**Abstract**

> Amid the current U.S.-China technological race, the U.S. has imposed export controls to deny China access to strategic technologies. We document that these measures prompted a broad-based decoupling of U.S. and Chinese supply chains. Once their Chinese customers are subject to export controls, U.S. suppliers are more likely to terminate relations with Chinese customers, including those not targeted by export controls. However, we find no evidence of reshoring or friend-shoring. As a result of these disruptions, affected suppliers have negative abnormal stock returns, wiping out $130 billion in market capitalization, and experience a drop in bank lending, profitability, and employment.

**Quote from conclusion**

> Moreover, the benefits of U.S. export controls, namely denying China access to advanced technology, may be limited as a result of Chinese strategic behavior. Indeed, there is evidence that, following U.S. export controls, China has boosted domestic innovation and self-reliance, and increased purchases from non-U.S. firms that produce similar technology to the U.S.-made ones subject to export controls.

[–] [email protected] 1 points 3 weeks ago* (last edited 3 weeks ago)

> Yeah I know that feeling, I posted and add unnecessary noise to Phil Harvey's forum about something I though was a "bug" or odd behavior with EXIF-tool, while it's was just my lacking reading skills... I felt so dumb :/

Happens to the best of us! As long as you make a genuine effort to find a solution, I think most people will be happy to help regardless.

As for the version of the unique-name code you wrote, you've got the spirit! The problem is that the try block will only catch the exception the first time around, so if there are two duplicates the uncaught exception will terminate the script. Separately, when working with exceptions it's important to be mindful of which particular exceptions the code in the try block might throw, and when. In this case, if the move is to another directory on the same filesystem, shutil.move will match the behavior of os.rename, which throws different types of exceptions depending on what goes wrong and what operating system you're on. Importantly, on Windows it will throw an exception if the destination file already exists, but this will generally not occur on Unix, where the existing file is silently overwritten.

(actually, I just realized that this may be an issue with pasting in your Python code messing up the indentation--one of the flaws of Python. If this was your actual code, I think it would work:)

```python
try:
  shutil.move(d['SourceFile'], subdirectory)
except:
  i = 0
  while os.path.exists(d['SourceFile']):
    i += 1
    base_name, extension = os.path.splitext(d['SourceFile'])
    new_filename = f"{base_name}-{i}{extension}"
  print(new_filename)
  os.rename(d['SourceFile'], new_filename)
  shutil.move(new_filename, subdirectory)
```
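
For illustration, here's a rough sketch of how you could sidestep the exception handling entirely by checking the destination before the move (this is just my take, and it assumes the same d['SourceFile'] and subdirectory variables from the surrounding loop):

```python
import os
import shutil

# Sketch: build a destination path that doesn't exist yet, then move once.
# No reliance on the platform-specific exceptions of shutil.move/os.rename.
base_name, extension = os.path.splitext(os.path.basename(d['SourceFile']))
dest = os.path.join(subdirectory, f"{base_name}{extension}")
i = 0
while os.path.exists(dest):
  i += 1
  dest = os.path.join(subdirectory, f"{base_name}-{i}{extension}")
shutil.move(d['SourceFile'], dest)
```

This handles any number of duplicates and behaves the same on Windows and Unix (with the usual caveat that there's a race if something else writes to the directory between the check and the move).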

(oh, and I should have mentioned this earlier, but: for Markdown parsers that support it (including Lemmy and GitHub), if you put the name of the language you're writing in right after the opening triple backtick (e.g. ```python or ```bash) it'll give you syntax highlighting for that language (although not as complete as what you'd see in an actual code editor))

Really cool that you figured out how to do it with exiftool natively--I'll be honest, I probably wouldn't have persevered enough to come up with that had it been me! Very interesting that it ended up being slower than the Python script, which I wouldn't have expected. One thing that comes to mind is that my script more or less separates the reads and writes: first it reads all the metadata, then it moves all the files (there are also reads to check for file existence in the per-file operations, but my understanding is that this happens in compact contiguous areas of the drive and the amount of data read is tiny). If exiftool performs the entire operation for one file at a time, it might end up being slower due to how storage access works.


Happy to have been able to help! Best of luck to you.

[–] [email protected] 2 points 3 weeks ago

I went through a phase in my late teens/early 20s where I had major bladder shyness. There were a few times in especially high pressure situations (e.g. right after a movie or during a break in a football game) where I just stood there for 30 seconds with no results and was like "welp I guess I'm just gonna have to hold it until I get home." I honestly don't think I had any major psychological shift, since I was and still am majorly anxious, but thankfully it waned over time and I can now piss in peace.

[–] [email protected] 2 points 3 weeks ago (2 children)

Wow, nice find! I was going to handle it by just arbitrarily picking the first tag which ended with CreateDate, FileModifyDate, etc., but this is a much better solution which relies on the native behavior of exiftool. I feel kind of silly for not reading more carefully: I couldn't find anything immediately useful in the documentation for the class used in the script (ExifToolHelper), but with the benefit of hindsight I now see this crucial detail about its parameters:

> All other parameters are passed directly to the super-class constructor: exiftool.ExifTool.__init__()

And sure enough, that's where the common_args parameter is detailed which handles this exact use case:

> common_args (list of str, or None) –
>
> Pass in additional parameters for the stay-open instance of exiftool.
>
> Defaults to ["-G", "-n"] as this is the most common use case.
>
> - -G (groupName level 1 enabled) separates the output with groupName:tag to disambiguate same-named tags under different groups.
> - -n (print conversion disabled) improves the speed and consistency of output, and is more machine-parsable.
>
> Passed directly into common_args property.
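
So if you ever need to forward extra flags to the underlying exiftool instance, something like this should work (a sketch based on the docs above; the -fast flag is just an illustrative extra, and I'm keeping the default -G and -n since the tag names in the script depend on them):

```python
import exiftool

# Keep the default -G/-n behavior and append an extra flag of our own.
with exiftool.ExifToolHelper(common_args=["-G", "-n", "-fast"]) as et:
  for d in et.get_metadata(["/path/to/photo.jpg"]):
    print(d)
```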

As for the renaming, you could handle this with os.path.exists (as with the directory creation) plus a bit of logic, along with the utility functions os.path.basename and os.path.splitext, to generate a unique name before the move operation:

```python
# Ensure uniqueness of path
basename = os.path.basename(d['SourceFile'])
filename, ext = os.path.splitext(basename)
count = 1
while os.path.exists(f'{subdirectory}/{basename}'):
  basename = f'{filename}-{count}{ext}'
  count += 1

shutil.move(d['SourceFile'], f'{subdirectory}/{basename}')
```

[–] [email protected] 2 points 3 weeks ago* (last edited 3 weeks ago) (5 children)

Alright, here's what I've got!

```python
#!/usr/bin/env python3

import datetime
import glob
import os
import re
import shutil

import exiftool


files = glob.glob(r"/path/to/photos/**/*", recursive=True)
# Necessary to avoid duplicate files; if all photos have the same extension
# you could simply add that extension to the end of the glob path instead
files = [f for f in files if os.path.isfile(f)]

parent_dir = r'/path/to/sorted/photos'
start_date = datetime.datetime(2015, 1, 1)
end_date = datetime.datetime(2024, 12, 31)
date_extractor = re.compile(r'^(\d{4}):(\d{2}):(\d{2})')

with exiftool.ExifToolHelper() as et:
  metadata = et.get_metadata(files)
  for d in metadata:
    for tag in ["EXIF:DateTimeOriginal", "EXIF:CreateDate",
                "File:FileModifyDate", "EXIF:ModifyDate",
                "XMP:DateAcquired"]:
      if tag in d.keys():
        # Per-file logic goes here
        year, month, day = [int(i) for i in date_extractor.match(d[tag]).group(1, 2, 3)]
        filedate = datetime.datetime(year, month, day)
        if filedate < start_date or filedate > end_date:
          break

        # Can uncomment below line for debugging purposes
        # print(f'{d["File:FileName"]} {d[tag]} {year}/{month}')
        subdirectory = f'{parent_dir}/{year}/{month}'
        if not os.path.exists(subdirectory):
          os.makedirs(subdirectory)

        shutil.move(d['SourceFile'], subdirectory)

        break
```

Other than PyExifTool, which will need to be installed using pip, all libraries used are part of the standard library. The basic flow of the script is to first grab metadata for all files using a single exiftool command, then for each file to check for the existence of the desired tags in succession. If a tag is found and its date is within the specified range, the script creates the YYYY/MM subdirectory if necessary, moves the file, and proceeds to the next file.

In my preliminary testing, this seemed to work great! The filtering by date worked as expected, and when I ran it on my whole test set (831 files) it took ~6 seconds of wall time. My gut feeling is that once you've implemented the main optimization of handling everything with a single execution of exiftool, this script (regardless of programming language) is going to be heavily I/O bound: the logic itself is simple, so the bulk of the time is spent reading and moving files, meaning your drive's speed will be the key limiting factor. Out of those 6 seconds, only half a second was actual CPU time. It's also worth keeping in mind that I'm doing this on a speedy NVMe SSD (6 GB/s sequential read/write, ~300K IOPS random read/write), so it'll be slower on a traditional HDD.

There might be some unnecessary complexity for some people's taste (e.g. using the datetime type instead of simple comparisons like in your bash script), but for something like this I'd prefer it to be brittle and break if there's unexpected behavior because I parsed something wrong or put in nonsensical inputs rather than fail silently in a way I might not even notice.

One important caveat is that none of my photos had that XMP:DateAcquired tag, so I can't be certain that that particular tag will work, and I'm not entirely sure it will be the tag name on your photos. You may want to run this tiny script just to check the name and format of the tag to ensure that it'll work with my script:

```python
#!/usr/bin/env python3

import exiftool
import glob
import os


files = glob.glob(r"/path/to/photos/**/*", recursive=True)
# Necessary to avoid duplicate files; if all photos have the same extension
# you could simply add that extension to the end of the glob path instead
files = [f for f in files if os.path.isfile(f)]
with exiftool.ExifToolHelper() as et:
  metadata = et.get_metadata(files)
  for d in metadata:
    if "XMP:DateAcquired" in d.keys():
      print(f'{d["File:FileName"]} {d["XMP:DateAcquired"]}')
```

If you run this on a subset of your data which contains XMP-tagged files and it correctly spits out a list of files plus date metadata beginning with YYYY:MM:DD, you're in the clear. If nothing shows up or the date format is different, I'd need to modify the script to account for that. In the former case, if you know of a specific file that does have the tag, it'd be helpful to get the exact tag name you see in the output from this script (I don't need the whole output, just the name of the DateAcquired key):

```python
#!/usr/bin/env python3

import exiftool
import json


with exiftool.ExifToolHelper() as et:
  metadata = et.get_metadata([r'path/to/dateacquired/file'])
  for d in metadata:
    print(json.dumps(d, indent=4))
```

If you do end up using this, I'll be curious to know how it compares to the parallel solution! If the exiftool startup time ends up being negligible on your machine I'd expect it to be similar (since they're both ultimately I/O bound, and parallel saves time by being able to have some threads executing while others are waiting for I/O), but if the exiftool spin-up time constitutes a significant portion of the execution time you may find it to be faster! If you don't end up using it, no worries--it was a fun little exercise and I learned about a library that will definitely save me some time in the future if I need to do some EXIF batch processing!

[–] [email protected] 1 points 4 weeks ago (7 children)

Yeah, I think the fact that you need to capture the output and then use that as input to another exiftool command complicates things a lot; if you just need to run an exiftool command on each photo and not worry about the output I think the -stay_open approach would work, but I honestly have no idea how you would juggle the input and output in your case.

Regardless, I'm glad you were able to see some improvement! Honestly, I'm the wrong person to ask about bash scripts, since I only use them for really basic stuff. There are wizards who do all kinds of crazy stuff with bash, which is incredibly useful if you're trying to create a portable tool with no dependencies beyond any binaries it may call. But personally, if I'm just hacking together something good enough to solve a one-off problem, I'd rather reach for a more powerful tool like Python, which demands less from my puny brain (forgive my sacrilege for saying this in a Bash community!). Here's an example of how I might accomplish a similar task in Python using a wrapper around exiftool, which lets me batch process all the files in one go and gives me nice structured data (dictionaries, in this case) without having to do any text manipulation:

```python
import exiftool
import glob

files = glob.glob(r"/path/to/photos/**/*", recursive=True)
with exiftool.ExifToolHelper() as et:
  metadata = et.get_metadata(files)
  for d in metadata:
    for tag in ["EXIF:DateTimeOriginal", "EXIF:CreateDate", "File:FileCreateDate", "File:FileModifyDate", "EXIF:DateAcquired"]:
      if tag in d.keys():
        # Per-file logic goes here
        print(f'{d["File:FileName"]} {d[tag]}')
        break
```

This outline of a script (which grabs the metadata from all files recursively and prints the filename and first date tag found for each) ran in 4.2 s for 831 photos on my machine (so ~5 ms per photo).

Since I'm not great in bash and not well versed in exiftool's options, I just want to check my understanding: for each photo, you want to check if it's in the specified date range, and then if it is you want to copy/move it to a directory of the format YYYYMMDD? I didn't actually handle that logic in the script above, but I showed where you would put any arbitrary operations on each file. If you're interested, I'd be happy to fill in the blank if you can describe your goal in a bit more detail!

[–] [email protected] 2 points 4 weeks ago (9 children)

Without looking too carefully at your script, I strongly suspect the issue is the time it takes to spin up an exiftool process once per file (not ideal for any program, but I find that exiftool is especially slow to start).

I have never done this myself, but apparently there's an option, -stay_open, which allows you to start up exiftool once and have it continually read arguments from a file or pipe. If you were to write your script around this flag, I suspect it would be much faster. Here's a brief forum discussion which may prove useful.
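
To give a sense of the protocol (a rough Python sketch rather than bash, and untested on my end, though the flags come straight from the exiftool documentation): you launch one long-lived process with -stay_open True -@ -, feed it one argument per line on stdin, end each command with -execute, and read output until exiftool prints {ready}:

```python
import subprocess

# One long-lived exiftool process instead of one process per file.
# "-@ -" tells exiftool to read its arguments from stdin.
proc = subprocess.Popen(
    ["exiftool", "-stay_open", "True", "-@", "-"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

def run_command(args):
  # Each argument goes on its own line; -execute runs the command.
  proc.stdin.write("\n".join(args) + "\n-execute\n")
  proc.stdin.flush()
  lines = []
  # exiftool prints {ready} on its own line when the command finishes.
  while (line := proc.stdout.readline().rstrip("\n")) != "{ready}":
    lines.append(line)
  return lines

print(run_command(["-DateTimeOriginal", "/path/to/photo.jpg"]))

# Shut the process down cleanly.
proc.stdin.write("-stay_open\nFalse\n")
proc.stdin.flush()
proc.wait()
```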

If you decide to do that, let me know how it goes! Might use it myself should the need arise.

[–] [email protected] 6 points 1 month ago

If you're watching consecutive episodes of a series you can always just download them to your phone before you head to work. Not really viable if you hop around a lot, though.

[–] [email protected] 3 points 1 month ago

Seconded, that beast (well, one of its predecessors) got me through college on the included toner cartridge alone and it's still kicking

 

Channel is full of bangers (some of my favorites are The Hustle and Microservices)
