User talk:Xover



Running headers (and footers)


I've only just realized that you're the one who created the index and most, if not all, of the pages for In Other Words, and proofread them, and now I'm muddying up your work with fussy edits... your patience must be truly formidable!

Earlier pages all placed the title of the book or poem in the header using the {{rh}} running header template; later ones use the {{c}} center template. I've been replacing the latter with running header, but since there's only one thing to include—and it varies in terms of whether it's used and whether it consists of the book title or current poem—I was already wondering why running header was needed, before I realized that the person who just gave me a better understanding of how to use templates is the one who placed the ones I was replacing.

So is there any reason to use running header in this instance, or would it be better converted back to plain center? And if there's no advantage, then what distinguishes it from the running header template consistently used in the footer, which contains only a centered page number? At least that occurs on every page, though I don't know whether it should make a difference.

Meanwhile, based on our previous discussion, I've been using {{c/s}} and {{c/e}} constantly, in some cases replacing {{c|{{foo|bar}}}} that you had made... I really don't want to make a mess of things. Is it more a judgment call based on the complexity of nested templates than a question of one being superior to the other? Sorry to have so many questions... P Aculeius (talk) 00:47, 27 June 2024 (UTC)

Replacing the Illustrator template


Hello. I have noticed you replaced the deprecated {{Illustrator}} template with the built-in parameter "illustrator", which is great. However, when the parameter was used for a specific subpage only and not for other subpages of the work, such as here, then the parameter "section_illustrator" should usually be used. Do you think it would be possible (and not too difficult) to find all the other cases and change the parameter? -- Jan Kameníček (talk) 15:46, 1 July 2024 (UTC)

@Jan.Kamenicek: I'm working on this, but it overlaps with a couple of other kinda knotty issues so it's taking some time. Xover (talk) 10:11, 12 July 2024 (UTC)
I'll follow up on this at WS:BR#Replace illustrator header parameter with section_illustrator in subpages of works. Xover (talk) 12:05, 21 July 2024 (UTC)

Portal:Federal_Government_of_the_United_States/Tab1


https://en.wikisource.org/w/index.php?title=Portal:Federal_Government_of_the_United_States/Tab1&action=edit&lintid=3236265

This was showing up as having fostered content, but looking at it, the only thing I can think of is the <onlyinclude></onlyinclude>, which should be ignored by the linter?

I think this is a false positive detection, along with the <section></section> tag issue on many of the other pages detected. ShakespeareFan00 (talk) 07:08, 5 July 2024 (UTC)

It's probably just a false positive, yes. That whole portal should also be redesigned (it's copied over from enWP and doesn't work for enWS) without these pseudo-tabs, so I'm disinclined to spend much time on fixing what's there now. Xover (talk) 10:25, 12 July 2024 (UTC)

Template:Information


I thought this had a box border around it? (It does on Commons.) I'd been doing a lot of edits in preparation for "night mode" and wanted a second opinion as to whether those edits had caused the border to vanish. (I did check the underlying styles and did not find anything obvious that could have caused the border to vanish.) ShakespeareFan00 (talk) 17:33, 6 July 2024 (UTC)

I just synced the template with Commons, so hopefully this problem should be obviated. —CalendulaAsteraceae (talk • contribs) 15:20, 7 July 2024 (UTC)
Thanks. I have on my todo to reimplement {{book}} and {{information}} from scratch because the Commons versions of these are a pain in the neck and use lots of Commons-specific stuff. Xover (talk) 10:18, 12 July 2024 (UTC)

PPoem


Page:A Jewish Interpretation of the Book of Genesis (Morgenstern, 1919, jewishinterpreta00morg).pdf/338

Is this a bug, or am I asking the >>> feature to do too much? ShakespeareFan00 (talk) 19:16, 11 July 2024 (UTC)

@ShakespeareFan00: You're asking too much of default {{ppoem}}. >>> is initially for line numbers and such; so whenever you have longer text there will no longer be room within the right margin and things start looking wonky. You can make it work for these cases but it'll require fiddling with margins etc. in the per-work CSS. I've never been sufficiently hard up to have to do that (it's fiddly and a pain) so I don't have any existing examples to hand. Xover (talk) 10:23, 12 July 2024 (UTC)
Do we need a quotation version of ppoem? ShakespeareFan00 (talk) 18:31, 12 July 2024 (UTC)

Problems with automatic header


While we're on the subject, I may as well inform you of an actual bug in the automatic header (that I didn't bother with because I wasn't going to use the automatic header anymore, and don't have access to fix it myself anyway).

If the Index page has multiple illustrators (like Index:Lange - The Blue Fairy Book.djvu), the automatic header will display this as follows: illustrated by [[Author:H. J. Ford and G. P. Jacomb Hood|H. J. Ford and G. P. Jacomb Hood]]

If you're fixing bugs in the automatic header, this might be one to look into. —Beleg Tâl (talk) 15:46, 19 July 2024 (UTC)

Heh, yeah, that's the one I'm currently looking into which is why I might as well try to fix other stuff while I'm at it. :) Xover (talk) 15:56, 19 July 2024 (UTC)
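
For illustration only: the symptom reported above (several names wrapped in a single Author link) goes away if the field is split into individual names before linking. The sketch below is in Python rather than whatever the automatic header actually uses, and `split_names` and `author_links` are hypothetical helpers, not part of any Wikisource module. It assumes names are separated by commas and a final "and", which will not hold for every Index page.

```python
import re

def split_names(field: str) -> list[str]:
    """Split 'A, B and C' into ['A', 'B', 'C'].

    Assumed separators: a comma (optionally followed by 'and'), or ' and '.
    """
    parts = re.split(r",\s*(?:and\s+)?|\s+and\s+", field)
    return [p.strip() for p in parts if p.strip()]

def author_links(field: str) -> str:
    """Build one [[Author:...]] link per name and join them for display."""
    links = [f"[[Author:{name}|{name}]]" for name in split_names(field)]
    if len(links) <= 1:
        return "".join(links)
    return ", ".join(links[:-1]) + " and " + links[-1]

print(author_links("H. J. Ford and G. P. Jacomb Hood"))
```

A real fix would also have to cope with names that themselves contain "and" or commas (for example corporate authors), so a structured list in the Index field may be more robust than string splitting.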

I'm having another issue, not sure whether it's a bug or PEBCAK. I'm trying to add a contributor field to Poetical works of Mathilde Blind/Preface for Arthur Symons, but it's not displaying in the header. Any ideas?

PS. I appreciate all your work on this :) —Beleg Tâl (talk) 15:22, 23 July 2024 (UTC)

If I understand it correctly, the <pages> tag can only pass its parameters to the ProofreadPage header template in order to override a field provided by the ProofreadPage Index template, and can't add additional parameters. So I'm thinking that we'd need to create a hidden field in the Index page for "contributor" (and any other such fields) that is always empty, which can then be overridden. Would that work? —Beleg Tâl (talk) 15:37, 23 July 2024 (UTC)
@Beleg Tâl: That is the way it has to be done, yes. I'm hoping it may be possible to get something more flexible long term, but for now that's the approach. Xover (talk) 12:14, 24 July 2024 (UTC)
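
The constraint discussed above can be modelled as a toy example. This is a sketch of the behaviour as described in this thread, not of ProofreadPage's actual code; `build_header_params` and the sample field values are made up for illustration.

```python
def build_header_params(index_fields: dict[str, str],
                        pages_attrs: dict[str, str]) -> dict[str, str]:
    """Model of the described rule: attributes on <pages> may override
    fields the Index defines, but cannot introduce new ones."""
    params = dict(index_fields)
    for key, value in pages_attrs.items():
        if key in params:  # only fields known to the Index can be overridden
            params[key] = value
    return params

# Illustrative values only.
index_without = {"title": "Poetical works of Mathilde Blind", "author": "Mathilde Blind"}

# No "contributor" field on the Index: the attribute is silently dropped.
print(build_header_params(index_without, {"contributor": "Arthur Symons"}))

# With a hidden, always-empty "contributor" field, the override takes effect.
index_with = {**index_without, "contributor": ""}
print(build_header_params(index_with, {"contributor": "Arthur Symons"}))
```

Under this model the hidden empty field acts purely as a declaration that the key exists, which is why it has to be added on the Index side.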

poem tag question


On Wikisource:Scriptorium/Help I have an open topic about the poem tag formatting carrying over into footnotes included inside the tag. I've had no responses yet. --EncycloPetey (talk) 22:49, 12 August 2024 (UTC)

Archive of Files missing machine-readable data?


Both categories have been cleaned up, and there are now only a few files that go in and out, so I thought maybe we should remove the {{DNAU}} and let the bots archive. Fine with that? — Alien333 (what I did & why I did it wrong) 18:10, 17 August 2024 (UTC)

@Alien333: Indeed. Thanks for the reminder. Xover (talk) 17:24, 18 August 2024 (UTC)

Daily News


Not really your fault, since Daily News is such a generic title, but you’ve got the wrong Daily News. Our Daily News was created for Daily News/1940/12/24/Cheated Death In Air Battles, Dies In Crash, which is for the New York Daily News, while G.K. Chesterton contributed to the Daily News of London (see The Daily News (UK) on Wikipedia). I’ll try to get scans of the relevant articles, but I can’t promise anything insofar as British newspapers (and library holdings) are concerned. TE(æ)A,ea. (talk) 17:54, 24 August 2024 (UTC)

@TE(æ)A,ea.: Thanks. I'm not sure I can absolve myself of sloppiness here, because I really should have caught that. I've dab'ed the two and updated links etc. Interestingly, almost all incoming links were intended for the London magazine, so the New York title was somewhat of a squatter. Xover (talk) 18:56, 24 August 2024 (UTC)

IRC #wikisource


Is there still any discussion over there? Because the few times I poked my nose around there wasn't. — Alien333 (what I did & why I did it wrong) 14:18, 28 August 2024 (UTC)

@Alien333: Very rarely. But most IRC channels are pretty low-volume these days, so I don't know that #wikisource is any worse. Xover (talk) 08:07, 29 August 2024 (UTC)

Page:Hans Andersen's Fairy Tales (1888).djvu/472 and {{img float}}


I saw it mentioned on your page, but I'm pretty sure that there's no need of any further technical work, as {{overfloat image}} fits perfectly (see that page). — Alien333 (what I did & why I did it wrong) 12:16, 3 September 2024 (UTC)

@Alien333: The issue isn't with Andersen /472, it's with {{img float}}. Feel free to remove the hidden comment as it was just a reminder / todo for myself about the issue. Xover (talk) 12:25, 3 September 2024 (UTC)

Orlando Furioso v4


Could you please generate a DjVu file from File:Orlando Furioso (Rose) v4 1825.pdf? Seven of the eight volumes were available at IA, and have been uploaded to commons:Category:Orlando Furioso (Rose), but volume 4 does not exist there for some reason. TE(æ)A,ea. (talk • contribs) was kind enough to acquire and provide a PDF, but I would prefer a DjVu, so that the whole series is in the same format (and because of the numerous technical issues we're having with PDFs). The DjVu should be named File:Orlando Furioso (Rose) v4 1825.djvu to match the naming pattern for the rest of the series. --EncycloPetey (talk) 20:12, 3 September 2024 (UTC)

@EncycloPetey: File:Orlando Furioso (Rose) v4 1825.djvu. IA has a scan and HathiTrust has several, it's just that UCal seems to be missing vol. 4 from their physical collection so it's missing in that scan series. I grabbed one of the Harvard copies and uploaded that since it seemed to be decent quality and then I wouldn't have to deal with Google's terrible PDFs. Xover (talk) 15:40, 4 September 2024 (UTC)
I found my IA copies using a search, which did not turn up a copy of volume 4. And if you look at the pattern in the local IDs, you can infer what would be correct for volume 4 in the set I found, but it's a scan of an entirely different book. I am aware of the copies at Hathi, and I asked TE(æ)A,ea. if one could be provided, but there were complications, I gather, from subsequent conversation.
Well, thanks, and I'll take a look today to see whether this copy is a complete scan or not. I have come across copies that were missing portions of the original. --EncycloPetey (talk) 16:43, 4 September 2024 (UTC)
There is no text layer. Could you please generate a text layer for the file? I am also getting zero file size errors, which I never have had previously with DjVu files. --EncycloPetey (talk) 16:52, 4 September 2024 (UTC) This problem sorted. --EncycloPetey (talk) 16:59, 4 September 2024 (UTC)
The text layer exists, but is garbled because it was generated by Google. Where there is text, there can be whole lines placed at the bottom of the page, instead of in their proper sequence, if not missing altogether from the page. I have found pages with randomized punctuation. I may be able to use the OCR tool, since this is a regular and very structured text with a relatively clean scan, but I foresee a higher error rate on this volume, and we have had recent days where the OCR tool failed or was unpredictable. --EncycloPetey (talk) 17:08, 4 September 2024 (UTC)
@EncycloPetey: The text layer in the DjVu was generated by my tools (tesseract is the OCR engine), not by Google. Spot checking pages in Index:Orlando Furioso (Rose) v4 1825.djvu I see no significant problems with the text layer. On what pages are you seeing problems? Xover (talk) 17:42, 4 September 2024 (UTC)
I did not keep track of which pages. I checked several dozen to be sure the scan had likely included all the relevant pages without duplicates, and noted bizarre issues like the ones I describe. But looking for a few examples now: scan page 130 has randomized start-of-line punctuation; 190 has text that does not appear on the page; 200 is one of the pages where the text was out of sequence. --EncycloPetey (talk) 17:50, 4 September 2024 (UTC)
/130 is just Tesseract being really bad at quotation marks. That's a general problem with no fix. /190 is Tesseract being over-eager and detecting the text on the opposite side of the sheet. It'll mostly just happen on empty pages (because it doesn't have real text to correct against), so it's usually not a big problem. The misplaced text on /200, though, is a weird bug. Tesseract detects the relevant line correctly, and with the correct coordinates (if you load it in DjView and turn on hidden text you'll see the text positioned exactly over the letters in the scan), but the line is stored out of order in the OCR output (Tesseract outputs an HTML-like structured format where each detected word is tagged with its coordinates on the page; normally each line is in the output in the order it is on the page, but here that line comes at the end of the output, and hence also in the plain text shown in the text box). I'm guessing this is because it is getting confused by the first-line indentation and thinking the page is a two-column layout. I'll try to see if there are any settings I can tweak or something, but I'm not hopeful and it probably won't happen soon in any case. IOW, unless the problems with this are more severe than currently apparent this is as good as it's going to get for now. Xover (talk) 18:46, 4 September 2024 (UTC)
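
To make the /200 symptom concrete: in hOCR-style output each line carries a bounding box, so a line that was stored out of sequence could in principle be put back by sorting on its vertical position. The following is a minimal sketch of that idea, not the tooling used for this file. The sample data and the `reorder_lines` helper are invented for illustration, and the approach assumes a single-column page (on a genuine two-column page a plain top-to-bottom sort would interleave the columns).

```python
import re

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

def parse_bbox(title: str) -> tuple[int, int, int, int]:
    """Extract (x0, y0, x1, y1) from an hOCR-style title attribute,
    e.g. 'bbox 10 20 300 40; baseline 0 -1'."""
    match = BBOX_RE.search(title)
    if match is None:
        raise ValueError(f"no bbox in {title!r}")
    x0, y0, x1, y1 = (int(v) for v in match.groups())
    return (x0, y0, x1, y1)

def reorder_lines(lines: list[tuple[str, str]]) -> list[str]:
    """Sort (title, text) pairs top-to-bottom, then left-to-right,
    and return only the text."""
    def sort_key(item: tuple[str, str]) -> tuple[int, int]:
        x0, y0, _, _ = parse_bbox(item[0])
        return (y0, x0)
    return [text for _, text in sorted(lines, key=sort_key)]

# Invented sample: the indented first line was emitted last.
ocr_output = [
    ("bbox 100 150 900 190", "second line of the stanza"),
    ("bbox 100 200 900 240", "third line of the stanza"),
    ("bbox 140 100 900 140", "first line, indented"),
]
print("\n".join(reorder_lines(ocr_output)))
```

Whether a post-processing pass like this is preferable to tweaking the OCR engine's page segmentation settings is a separate question that this sketch does not address.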

Vector 2022


Hey, I'd like to revive the topic of making Vector 2022 the default here. Before I start a discussion in Scriptorium though, I wanted to check in with you. Do you see any issues that need to be addressed (fixed, explained, regardless) either before or reasonably shortly after deployment? Maybe we could do some things before involving more people, esp. the less technical editors. Thanks! SGrabarczuk (WMF) (talk) 19:20, 4 September 2024 (UTC)

@SGrabarczuk (WMF): This is just a quick braindump before morning coffee. Once the caffeine kicks in I may regret everything and take it all back. Or something like that… 😎
I think the biggest issue is going to be general pushback from the community of the kind enWP so emphatically provided, even if somewhat more muted and on different causes. Partly that's going to be motivated by resistance to change (we have contributors still using Monobook for no articulable reason), but partly also because Vector 2022 reflects different priorities than their own. Its major focuses are things that make sense on a Wikipedia, but not so much on Wikisource; it moves around UI elements that are now harder to find and get at than before; and it reflects WMF priorities over community priorities (e.g. the language selector vs. the mw-indicators positioning). I'm afraid the community here will see little benefit in the changes Vector 2022 makes, and things like having to go to a submenu to find the link to your own user talk page will be viewed as significant drawbacks. I could be wrong, but that's my concern.
In more concrete and technical terms I'm not aware of any major things of the "breaks core workflow" variety. The new menus are breaking some Gadgets that modify them (bigChunkedUpload is the latest I've noticed). The interlanguage links we manually add to Special:RecentChanges by way of MediaWiki:Recentchangestext and {{Interwiki Wikisource}} no longer work in Vector 2022 (they work in all other skins). Vector also overlaps our Dynamic Layouts (essentially MediaWiki:Gadget-PageNumbers.js) while providing no community control, not integrating with our layout system, and not providing any facilities that make our implementation easier (the Gadget is somewhat fragile and prone to FOUC-type problems). Also, because Wikisource is so poorly supported by the WMF (so far as I can tell not a single developer has ever been allocated to Wikisource; we depend entirely on the good will of individual developers and teams with other responsibilities for everything we need) we are dependent on a large number of gadgets and user scripts that make repetitive editing tasks more efficient. Vector 2022 is designed with hiding these away as an apparent goal (stuff added to #p-toolbox is now hidden in the looong and cluttered Tools dropdown menu), leaving us with no clear way to surface editing helpers that need to be Fitts's law-compliant. The 2017 editing toolbar and Visual Editor don't support Wikisource (at all), and the 2010 editor is really primitive in terms of extension and integration points (see e.g. T370353 for the completely basic s...tuff that's not there). The paragraph spacing is still broken (compare the text inside the box on this page in Vector 2010 and Vector 2022), and this affects a lot of pages on enWS.
I haven't done a systematic assessment of Vector 2022 here (partly because it's a moving target, partly because I haven't had time, partly because y'all have been focussed on the Wikipedias), but I have had it set as default since the last time the issue was brought up back in March. My assessment is that the main issue with making Vector 2022 default is that the value proposition for English Wikisource—the "What's in it for me?"—is too poor when held up against both the concrete drawbacks and the need for change in general (all change has a cost; resistance to change is not inherently irrational). If the development of the skin had been more able to identify and incorporate this specific community's needs and priorities in its scope early on I think that calculus could have easily changed. But as it stands the value proposition is going to be perceived as marginal, at best, and the Wikisourcen in general have way too few technical contributors able to follow up with the Web Team to get issues fixed as they crop up (by the time the community has got its act together the team will be onto other tasks and greener pastures). Xover (talk) 06:14, 5 September 2024 (UTC)
Wow, thanks for the detailed and long response, I really appreciate it! If you'd like to add something or take something back, you can also reach out to me on Discord, Telegram, lots of places - there are very few people named Szymon Grabarczuk, I'm easy to find across platforms :D SGrabarczuk (WMF) (talk) 08:43, 5 September 2024 (UTC)

Index:Orlando by Virginia Woolf.djvu


The text layer on this Index is off by one. Could you please correct this issue? It's in PD in the US this year, but transcription has been held up for months by a variety of issues. --EncycloPetey (talk) 17:47, 8 September 2024 (UTC)