#audiodescriptions — Public Fediverse posts on home.social

Jupiter Rowland @[email protected] · 2024-06-28 · 15:40 UTC

@Robert Kingett, blind I don't trust anything generated. At least not with super-obscure niche content like what I post.

And audio descriptions in general are why I'll never publish videos in the Fediverse.

I'd have to go into similar detail as for my pictures, only for moving pictures plus sound plus voice-over now. My descriptions would have to be so detailed that the video would have to pause to let the audio description catch up with the visuals. In fact, the video would spend more time paused while the audio description is rambling than actually moving, and it would never spend more than a few seconds moving at a time.

For one, I would have to describe and explain what the video shows at the very same level of detail as I describe my images. And at least once I've described one single image at such a level of detail that it'd probably take a screen reader one full hour to read the image description aloud.

Besides, I would have to take into account that it's a video. Everything would need timestamps. And instead of only describing the camera position and the camera angle, I would have to describe the camera movements like so:

Seven minutes, eighteen point one three seconds: The camera quickly rotates to the left around a vertical axis through a point roughly two point four metres straight ahead of the avatar. It starts rotating from the direction in which the avatar is facing, roughly twelve degrees to the east of north. The barn which first appeared at five minutes, fifty-two point two eight seconds comes into view again, including all decoration around it. The camera only rotates around this vertical axis and not around any horizontal axis. The avatar does not rotate with the camera.

Seven minutes, eighteen point six four seconds: The video pauses to let this description catch up.

Seven minutes, eighteen point seven one seconds: The video no longer pauses. The camera reaches a rotation angle of roughly twenty degrees to the south of west. The rotation speed of the camera slows down. It continues to rotate to the left.

Seven minutes, eighteen point nine three seconds: The video pauses to let this description catch up.

Seven minutes, nineteen point zero four seconds: The video no longer pauses. The camera stops rotating at an angle of roughly twenty-five degrees to the west of south.

That is, in order to cater to deaf-blind users, I would have to have two time codes. One, the time code of the original video, not taking the pauses into account. Two, the time code of the described video with catch-up pauses.

And the video with catch-up pauses would be dramatically longer than the original video. Ten minutes of video would take me weeks to describe, probably over a month. And it would end up many hours long, depending on how much there is to describe and explain.

So a time code in the Braille description for deaf-blind users might actually read, "Six minutes, thirty-seven point five five seconds in the original video, fourteen hours, three minutes, forty-nine point two one seconds in this described version of the video."
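The relation between the two time codes is plain arithmetic: the position in the described version is the position in the original plus the length of every catch-up pause inserted up to that point. Here is a minimal sketch in Python under stated assumptions. The function names, the pause list and its values are hypothetical and chosen only so that the result matches the example above. Times are kept as integer hundredths of a second to avoid rounding trouble, and a pause placed exactly at the queried position is assumed to count as already played.

```python
from bisect import bisect_right
from itertools import accumulate


def described_time(original_cs, pauses):
    """Map a position in the original video to the described version.

    original_cs: position in the original video, in hundredths of a second.
    pauses: list of (position_cs, length_cs) tuples sorted by position,
            one per catch-up pause (hypothetical data layout).
    """
    positions = [position for position, _ in pauses]
    running_totals = list(accumulate(length for _, length in pauses))
    # Count the pauses located at or before the requested position.
    count = bisect_right(positions, original_cs)
    paused_so_far = running_totals[count - 1] if count else 0
    return original_cs + paused_so_far


def format_timecode(cs):
    """Format hundredths of a second as HH:MM:SS.hh."""
    hours, rest = divmod(cs, 360000)
    minutes, rest = divmod(rest, 6000)
    seconds, hundredths = divmod(rest, 100)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{hundredths:02d}"


# Hypothetical pauses: two before 6:37.55 in the original, one after it.
example_pauses = [(1000, 3000000), (20000, 2023166), (45000, 100000)]

original = 39755  # 6 minutes, 37.55 seconds
described = described_time(original, example_pauses)
print(format_timecode(original), "->", format_timecode(described))
# prints "00:06:37.55 -> 14:03:49.21"
```

A real description script would likely store the pause lengths alongside the description text, so both time codes could be generated from the same data.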

By the way, no, an AI can't do that.

#Long #LongPost #CWLong #CWLongPost #MediaDescription #MediaDescriptions #AudioDescription #AudioDescriptions


Jupiter Rowland @[email protected] · 2023-12-29 · 20:57 UTC

@fastfinge @Xantastic "Image description" if it's an image, "audio transcript" or maybe "audio description" if it's audio, "video transcript" or maybe "video description" if it's a video, "media description" as a more general term.

Mastodon only refers to it as "alt-text" because that's where Mastodon users always put it. After all, alt-text gives them 1,500 characters per image, but the toot only gives them 500 characters minus content warnings minus hashtags minus mentions etc.

I use more appropriate terms and more appropriate places to put my descriptions because I don't have to worry about character limits here on Hubzilla.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #ImageDescription #ImageDescriptions #AltText #ImageDescriptionMeta #CWImageDescriptionMeta #MediaDescription #MediaDescriptions #AudioDescription #AudioDescriptions


Jupiter Rowland @[email protected] · 2023-12-29 · 20:44 UTC

@fastfinge @modulux The question that nobody can agree on an answer to is: How detailed is the minimum requirement for media descriptions? How detailed is optimal? How detailed is too much, and is there such a thing as "too much"?

I'm someone whose "optimal" for image descriptions is probably beyond "too much" for many readers and definitely "too much" for almost all writers, and it keeps getting worse. I can post a 37,000-character description for one image that took me over 13 hours to research and write and find it lacking in multiple ways afterwards. In fact, I've done so.

Now I'm wondering what'd be an optimal audio description for music. Since I'm also a hobbyist musician, I might try to approach describing music in a way that goes into similar detail as sheet music, only that it includes sounds as well. Something that involves describing each audio part separately, although I'm not sure whether I should use the time within the whole audio file, the time only for the song, the bar-based timecode for the song or two or all three of them.

So I guess reading my audio description for one song is likely to take longer than listening to the whole album, if not multiple albums. But it'd be detailed and hopefully informative. That is, if I find a way to describe individual sounds including what effects do to them that's satisfying both for people who have turned deaf and people who were born deaf, both being complete laypeople when it comes to music.

But whether that's the right way, still not sufficient or way overkill, I don't know.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #ImageDescription #ImageDescriptions #AltText #ImageDescriptionMeta #CWImageDescriptionMeta #MediaDescription #MediaDescriptions #AudioDescription #AudioDescriptions


Jupiter Rowland @[email protected] · 2023-11-04 · 11:57 UTC

CW: Thinking about making videos, dropping it because of accessibility requirements; CW: long (2,940 characters)

A couple of times in the past, I've considered making virtual-world videos once I have a sufficiently powerful graphics card and a screen with a resolution that won't make me the laughing stock of the Fediverse. I think I have the graphics card now.

I was going to publish them on PeerTube, which is part of the Fediverse, in case you don't know yet.

I no longer am.

Fulfilling the accessibility requirements in the Fediverse in a sufficiently informative way would not only be an outright titanic effort. It would also make my videos borderline unwatchable.

I'm someone who posts a picture of a shelf with a few dozen boxes on it and describes it with over 40,000 words. I have to because people wouldn't even get what's in the picture in the first place if I didn't. Imagine what I'd do in a video that shows much more than that. Much much more.

Very early in the video, it'd freeze so I could explain where it was made, which would take a few minutes. Once everything in-world is visible for the first time, the video would freeze for half an hour or more while I describe and explain absolutely everything within the video frame. Whenever I move or turn, or the camera moves or turns, and something new comes into view, the video would freeze again for several minutes of description and explanation. And so forth.

In fact, the video freezes would end up quite long because I would have to speak slowly, clearly and in Simple English. That wouldn't make my descriptions any shorter, though.

A five-minute clip would be inflated to six hours or more.

The effort to get there would be gargantuan. Even for short videos, writing the audio description would take me weeks. Then it'd take me some more weeks to re-phrase everything in Simple English. The recording itself would take several days. Of course, I would have to transcribe what I have said in the original video to make subtitles. Then I would have to weave the transcriptions into the audio description script to create special subtitles for deaf-blind users. You never know what they might be interested in, no matter how niche your videos are.

And then it'd all be in vain. Nobody would watch the videos, either because they'd be way too long or because nobody is interested in the topic or both. Thus, there wouldn't be any comments on them.

So I wouldn't know if I had done everything right by being 100% compliant with WCAG 2.2 and the Fediverse accessibility requirements. Maybe I've missed something and not been thorough enough, but I wouldn't know. Or maybe I've completely overdone it and rendered the video unwatchable by being utterly overcompliant with accessibility requirements, and a tiny fraction of what I've done would have been sufficient. But since nobody would ever comment, I wouldn't know.

#Fediverse #Accessibility #A11y #Inclusion #MediaDescription #MediaDescriptions #AudioDescription #AudioDescriptions #Long #LongPost #CWLong #CWLongPost
