Normalizing English language tag #3100
Comments
🤖 this is your friendly neighborhood build bot announcing test build 6.7.263.7430 ("fixes #3100") This update may name other issues, but the build just dropped here is for you; it just means problems already fixed in other issues have been folded into the work we are doing here. Install in Zotero by downloading test build 6.7.263.7430, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...". |
This works well; thank you! I would suggest two improvements:
pandoc --citeproc -t plain << EOT
---
references:
- id: example
  author: "Author"
  title: "Example title"
  language: "en-GB"
  issued:
    year: 2024
---
Citation: [@example].
EOT
pandoc --citeproc -t plain << EOT
---
references:
- id: example
  author: "Author"
  title: "Example title"
  language: "en la"
  issued:
    year: 2024
---
Citation: [@example].
EOT
Many thanks again! |
A word of caution, though: In CSL, the From https://docs.citationstyles.org/en/stable/specification.html#appendix-iv-variables (note the singular!):
The reason the Unfortunately, there is no CSL variable intended to record the language(s) the content of a work is written in (for this purpose, biblatex has |
But then there's no benefit to adding En-US over just en. |
I’d still recommend not throwing away information, so I’d always import something like In any case, keeping language-plus-locale tags is essential when exporting to biblatex, as biblatex can also modify hyphenation, punctuation, and localised terms, all of which might differ between, say, From the current biblatex manual:
|
But that doesn't apply to CSL, right? |
Right. From a CSL (processor) perspective, it currently does not matter if it’s That being said, The OP was about normalising upon import after all, where I would continue to argue that throwing away available information (e.g., by ‘normalising’ from |
That is not my understanding - I think the OP was talking about items already in Zotero, and that during that import (from whatever source) the dates end up being a hodgepodge (likely so no information is discarded), and how they could be normalized on CSL export. I don't have CSL import, just export. The reason I'd prefer to leave it as |
Yes, my aim is purely to export items from Zotero into valid CSL JSON, for use in Pandoc. While currently this only changes whether title case is applied, I plan to see whether language tagging can also be applied to citations, if this field can be normalized reliably. I hadn't realized that it was against the spec to list more than one language tag. In that case, if more than one is recorded in Zotero, perhaps only the first could be kept? If I can get Pandoc to output language tagging with citations, it could be useful to be able to distinguish between, for example, |
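To make the "keep only the first" idea concrete, here is a minimal sketch in Python. It is not BBT's code: the function name `first_language_tag` and the choice of separators (whitespace, commas, semicolons) are assumptions made for illustration. Note that splitting on whitespace would also cut a free-form name such as "Old English" down to its first word, so a real implementation would have to resolve language names before splitting.

```python
import re


def first_language_tag(value: str) -> str:
    """Return the first entry of a multi-valued language string.

    Hypothetical helper: the separators handled here (whitespace, commas,
    semicolons) are an assumption, not something the thread specifies.
    """
    # Split on runs of separators and drop empty pieces.
    parts = [p for p in re.split(r"[\s,;]+", value.strip()) if p]
    # An empty or separator-only input yields an empty string.
    return parts[0] if parts else ""
```

With this sketch, a field containing `en la` would be exported as `en`, while a single tag such as `en-GB` passes through unchanged.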
Zotero doesn't really have the concept of multiple languages stored per item. It's a single free-form string. I can take a look later next week at what I can do about locales. It may in the end be simpler but it's not now. The language normalizer in BBT scripts off of babel's language configs, and I don't recall how much flexibility I kept in that process. |
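For the locale question, one possible approach is to fix only the casing of a tag and keep its region, so that no information is thrown away. The sketch below is in Python and is not how BBT works; `canonical_case` and `primary_subtag` are hypothetical names. It follows the casing conventions recommended in BCP 47 (lowercase language, title-case four-letter script, uppercase two-letter region), and treating an underscore as a hyphen is an extra assumption.

```python
def canonical_case(tag: str) -> str:
    """Apply BCP 47 casing conventions to a tag without dropping subtags.

    Hypothetical helper; it does not validate the tag against any registry.
    """
    # Assumption: "en_GB"-style input should be treated like "en-GB".
    subtags = tag.strip().replace("_", "-").split("-")
    out = []
    for i, sub in enumerate(subtags):
        if i == 0:
            out.append(sub.lower())      # primary language subtag, e.g. "en"
        elif len(sub) == 4 and sub.isalpha():
            out.append(sub.title())      # script subtag, e.g. "Latn"
        elif len(sub) == 2 and sub.isalpha():
            out.append(sub.upper())      # region subtag, e.g. "GB"
        else:
            out.append(sub.lower())      # anything else is left lowercase
    return "-".join(out)


def primary_subtag(tag: str) -> str:
    """Return only the language part, e.g. "en" for "En-US"."""
    return canonical_case(tag).split("-")[0]
```

A consumer that only cares whether an item is English could compare `primary_subtag(tag)` with `en`, while an export that benefits from the locale could keep the full output of `canonical_case(tag)`.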
Debug log ID
FH3W5CKW-refs-euc/6.7.263-7
What happened?
The CSL spec indicates that the language field should provide ISO 639-1 language tags (i.e. IETF tags). Hence, pandoc-citeproc follows this to the letter and will only apply title case to items either with no language specified or with the tag en. Unfortunately, Zotero does not normalize this on import, and many items end up with non-IETF tags in the language field, mostly ISO 639-2 codes, which triggers an undesired sentence-case citation. It would be most helpful if BBT could convert ISO 639-2 to ISO 639-1 language codes, and perhaps also normalize strings such as English to en.
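The requested conversion could look roughly like the following Python sketch. It only illustrates the idea and is not BBT's implementation: the function name `normalize_language` is hypothetical, and the two lookup tables are a small hand-picked subset, where a real version would need the full ISO 639 tables.

```python
# Hypothetical lookup tables: a small illustrative subset only.
# ISO 639-2 has both bibliographic and terminology codes for some
# languages (e.g. "fre"/"fra", "ger"/"deu"), so both are listed.
ISO_639_2_TO_1 = {
    "eng": "en",
    "fre": "fr", "fra": "fr",
    "ger": "de", "deu": "de",
    "spa": "es",
    "ita": "it",
    "lat": "la",
}

# English language names, matched case-insensitively.
LANGUAGE_NAMES = {
    "english": "en",
    "french": "fr",
    "german": "de",
    "spanish": "es",
    "italian": "it",
    "latin": "la",
}


def normalize_language(value: str) -> str:
    """Return an ISO 639-1 code when one is known, else the trimmed input."""
    key = value.strip().lower()
    if key in ISO_639_2_TO_1:
        return ISO_639_2_TO_1[key]
    if key in LANGUAGE_NAMES:
        return LANGUAGE_NAMES[key]
    # Unknown values (including tags such as "en-GB") pass through untouched.
    return value.strip()
```

Under these assumptions, `eng` and `English` would both be exported as `en`, and anything the tables do not recognise is left as it was entered in Zotero.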