User talk:Schlurcher
ISO speed (sensor value scaling) for Canon PowerShot A430 compact digital camera
Hello. I have uploaded a few old photos taken by a Canon PowerShot A430 compact digital camera. SchlurcherBot tagged one of them with the exposure time (virtual shutter speed) taken from the metadata in the file. However, the bot did not tag the same image with its ISO speed (virtual sensor sensitivity, how the raw sensor values are scaled to get the color values in the jpeg image file).
You can find the ISO speed for this image with the exiftool program, using a command like
exiftool -ISO Switzerland_glacier_from_above.jpg
(assuming you downloaded the image to that filename). Note that the ISO speed may not be trivial to get from this file: I believe that in this case exiftool computes it from two different pieces of camera-specific metadata in the file; see the notes for the Canon:AutoISO tag and the Composite:ISO tag. You can use a command like
exiftool -G1 -s -"*ISO*" Switzerland_glacier_from_above.jpg
to see both the original metadata and the computed value together.
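For anyone who wants to script the same check, here is a minimal Python sketch. It assumes exiftool is installed and on the PATH, and it relies on exiftool's -j (JSON output) option together with the -G1 and wildcard tag options used above. The function names iso_tags and read_iso_tags are made up for this illustration.

```python
import json
import subprocess


def iso_tags(exiftool_json: str) -> dict:
    """Return every tag whose name contains 'ISO' from `exiftool -j` output."""
    # exiftool -j prints a JSON array with one object per input file.
    records = json.loads(exiftool_json)
    if not records:
        return {}
    return {key: value for key, value in records[0].items() if "ISO" in key}


def read_iso_tags(path: str) -> dict:
    """Run exiftool on a local file and collect its ISO-related tags."""
    result = subprocess.run(
        # No shell is involved here, so the wildcard needs no quoting.
        ["exiftool", "-j", "-G1", "-*ISO*", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return iso_tags(result.stdout)
```

Calling read_iso_tags("Switzerland_glacier_from_above.jpg") should return the same tags as the second command above, keyed by group and tag name.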
From what I can tell, SchlurcherBot does add the ISO speed from file metadata into Wikidata for some photos from other cameras, even when the camera was set to select the ISO speed automatically; see e.g. this photo taken with a Panasonic camera or this one taken with a Nikon. By analogy, I think it may make sense for the bot to add the ISO speed for the Canon camera too. I don't consider this a bug, since the bot need not recognize all possible metadata in all image files; I just want to make sure that you know about this behavior.
I'll note that User:Emijrpbot behaves exactly the same as SchlurcherBot here.
– b_jonas 15:00, 17 October 2024 (UTC)
- Hi! Thanks for letting me know. I currently get the ISO speed from the Wikimedia API, i.e. from the exposed metadata, which is only a subset of everything in the file. I'm not downloading the actual file, as this has not been needed so far. So, I guess there is no trivial solution. --Schlurcher (talk) 09:53, 18 October 2024 (UTC)
- Oh wow! I assumed that you were just downloading the files and extracting the metadata. I should look up how the MediaWiki software gets the metadata, then. Thank you for the answer. – b_jonas 10:09, 18 October 2024 (UTC)
- This is the exact API call [1]. If for the files missing ISO one of the other attributes contains that information, I can easily fix that. Could you please also name such a file? Thanks Schlurcher (talk) 11:07, 18 October 2024 (UTC)
- Thank you, the information about the exact API call helps me debug this. I tried this Imageinfo API call on two images taken with that Canon camera: [2] [3], and the information that it returns indeed contains the exposure time but not the ISO speed. – b_jonas 11:14, 19 October 2024 (UTC)
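A rough Python sketch of this kind of check follows. It is not the exact call linked at [1]; it assumes the standard MediaWiki imageinfo query with iiprop=metadata and the default JSON format, and the EXIF field names ExposureTime and ISOSpeedRatings as MediaWiki reports them. The helper names are invented for this example.

```python
from urllib.parse import urlencode

API_URL = "https://commons.wikimedia.org/w/api.php"


def imageinfo_url(title: str) -> str:
    """Build an imageinfo query URL that asks for the exposed file metadata."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "metadata",
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)


def metadata_fields(response: dict) -> dict:
    """Flatten the name/value metadata list of an imageinfo response."""
    fields = {}
    for page in response.get("query", {}).get("pages", {}).values():
        for info in page.get("imageinfo", []):
            # "metadata" can be null for files without embedded metadata.
            for item in info.get("metadata") or []:
                fields[item["name"]] = item["value"]
    return fields


def missing_fields(response: dict, wanted=("ExposureTime", "ISOSpeedRatings")) -> list:
    """List the wanted metadata names that the API response does not expose."""
    present = metadata_fields(response)
    return [name for name in wanted if name not in present]
```

Fetching imageinfo_url("File:Switzerland_glacier_from_above.jpg"), decoding the JSON, and passing it to missing_fields would show which of the two fields the API leaves out for that file.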
The metadata on camera settings for any photograph from before 1990 or so describes the scanner, not the photograph. As a result, this creates a lot of bad data when applied to images such as this. Can I suggest excluding anything dated before 1990 from this sort of metadata copying? Adam Cuerden (talk) 10:55, 21 October 2024 (UTC)
Timetravel
I'm wondering whether the value at [9] should be a different one. Presumably that camera was used to copy the photo (hence its appearance in the EXIF).
∞∞ Enhancing999 (talk) 22:53, 29 October 2024 (UTC)
- Yes, these cases are tricky. I've already coded some conditions to avoid as many of them as possible. Any further ideas? --Schlurcher (talk) 10:17, 30 October 2024 (UTC)
- Any file uploaded by Swiss_National_Library and ETH-Bibliothek is generally a digitization, not digitally native.
∞∞ Enhancing999 (talk) 10:50, 30 October 2024 (UTC)
- To find more: Special:Search/haswbstatement:P4082 CH-NB
∞∞ Enhancing999 (talk) 07:34, 1 November 2024 (UTC)
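The two suggestions on this page (the 1990 cutoff and the uploader list) could be combined into a filter along these lines. This is a hypothetical sketch only, not the conditions SchlurcherBot actually uses. The function name and the idea that a creation year is available for each file are assumptions.

```python
from typing import Optional

# Uploaders named in this thread whose files are generally digitizations.
DIGITIZING_UPLOADERS = {"Swiss_National_Library", "ETH-Bibliothek"}
# Cutoff year suggested earlier on this page.
CUTOFF_YEAR = 1990


def should_copy_camera_metadata(uploader: str, creation_year: Optional[int]) -> bool:
    """Decide whether EXIF camera settings plausibly describe the photograph."""
    if uploader in DIGITIZING_UPLOADERS:
        return False  # EXIF most likely describes the scanner or copy camera
    if creation_year is not None and creation_year < CUTOFF_YEAR:
        return False  # the work predates the cutoff, so EXIF is from digitization
    return True
```

A file with an unknown creation year is still accepted here; whether that is the right default is a judgment call.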
Regex question.
Hello Schlurcher, I'm trying to figure out how to move 35 thousand files across 3k categories. The categories all have the format: (type of art) by (artist) in [the] Statens Museum for Kunst. The categories the files are in right now have the word in brackets in the title; the target cats don't. Is there a way to do this with regex and pywikibot? All the Best -- Chuck Talk 04:21, 31 October 2024 (UTC)
- script is at [10]. Your thoughts are appreciated. All the Best -- Chuck Talk 04:54, 31 October 2024 (UTC)
- @Alachuckthebuck: Thanks for reaching out. For such a case, I would not program a custom script, but use the standard replace.py [11] with a custom addition to user-fixes.py. That has the benefit that the replace.py script takes care of all the error handling, the maxlag, etc., and it's simply more reliable.
- Your user-fixes.py would have something like this:
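fixes['SchlurcherBot3-16'] = {
    'regex': True,       # treat the replacement pairs as regular expressions
    'recursive': False,
    'nocase': True,      # match "category:" and "Category:" alike
    'msg': {
        '_default': 'Perform category move',
    },
    'replacements': [
        # Group 1 is the "[[category:" prefix, group 2 the type of art and
        # group 3 the artist; only the word "the" is dropped.
        (r'(\[\[category:)([^\]|]+?) by ([^\]|]+?) in the Statens Museum for Kunst\]\]',
         r'\1\2 by \3 in Statens Museum for Kunst]]')
    ],
    'exceptions': {
        'inside-tags': [
            'nowiki',
            'comment',
            'math',
            'pre'
        ]
    }
}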
You can add multiple replacement tuples of the form (r'pattern', r'replacement'), separated by commas, if a single one does not do the trick. Gemini [12] is usually quite good at constructing the regex. Your command line call would look like this (called from the pywikibot core folder):
python3 pwb.py replace.py -file:"..." -namespace:6 -fix:SchlurcherBot3-16 -always -regex -nocase -dotall -multiline -putthrottle:2 -log:"..." -simulate
- All this is untested. Once you have confirmed that the edits look good, remove the -simulate option. Hope this gives you something to work with. --Schlurcher (talk) 08:10, 31 October 2024 (UTC)
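Before running the bot, even with -simulate, a pattern of this kind can be checked in plain Python. The sketch below is an assumption-laden example rather than the fix above verbatim. It anchors on the "[[category:" prefix, does not handle category links with a sort key, and uses a placeholder artist name.

```python
import re

# Group 1: "[[category:" prefix, group 2: type of art, group 3: artist.
PATTERN = re.compile(
    r'(\[\[category:)([^\]|]+?) by ([^\]|]+?) in the Statens Museum for Kunst\]\]',
    re.IGNORECASE,
)
REPLACEMENT = r'\1\2 by \3 in Statens Museum for Kunst]]'


def move_category(wikitext: str) -> str:
    """Drop 'the' from matching Statens Museum for Kunst category links."""
    return PATTERN.sub(REPLACEMENT, wikitext)


if __name__ == "__main__":
    sample = (
        "[[Category:Unrelated category]]\n"
        "[[Category:Paintings by Example Artist in the Statens Museum for Kunst]]"
    )
    print(move_category(sample))
```

Only the second link in the sample should change; unrelated category links are left as they are.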
- If I were to run this from the hypothetical bot Chuckbot, is it just a drop-in replacement? All the Best -- Chuck Talk 16:17, 31 October 2024 (UTC)
- You run the login script to log in as your bot and then that script. There is no need to do any actual Python coding for this. --Schlurcher (talk) 18:45, 31 October 2024 (UTC)
- Thank you so much! All the Best -- Chuck Talk 20:49, 31 October 2024 (UTC)