Add an encoding parameter to io.load_tabby #116

Open
mslw wants to merge 2 commits from mslw/encoding2 into main
mslw commented 2023-11-21 16:20:45 +00:00 (Migrated from github.com)

This PR resolves #112 by adding an optional `encoding` parameter to `io.load_tabby`. The parameter specifies the encoding used when reading TSV files.

When not specified (`encoding=None`), we keep the default behavior (implicitly using `locale.getencoding()` [^1],[^2]).
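
To illustrate the intended behavior, here is a minimal sketch of how an optional encoding can be passed through to `Path.open()`. The helper `read_tsv_rows` and the file name are hypothetical and only stand in for the idea; this is not the actual `io.load_tabby` implementation.

```python
import csv
import tempfile
from pathlib import Path
from typing import List, Optional


def read_tsv_rows(path: Path, encoding: Optional[str] = None) -> List[List[str]]:
    """Hypothetical helper (not part of datalad-tabby): read a TSV file into rows."""
    # With encoding=None, Path.open() falls back to the locale-dependent
    # default encoding, i.e. the pre-existing behavior is kept.
    with path.open(newline="", encoding=encoding) as f:
        return list(csv.reader(f, delimiter="\t"))


with tempfile.TemporaryDirectory() as tmp:
    tsv = Path(tmp) / "authors.tsv"
    # Write the file as UTF-8 bytes, independent of the current locale.
    tsv.write_bytes("name\tcity\nMüller\tKöln\n".encode("utf-8"))
    # The caller states the encoding explicitly instead of relying on the locale.
    rows = read_tsv_rows(tsv, encoding="utf-8")
```

Calling the same helper without `encoding` would give the same rows only on systems whose locale encoding happens to be UTF-8.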

With external libraries it might be possible to guess a file encoding that produces a correct result based on the file's content, but success is not guaranteed when there are only a few non-ASCII characters in the entire file (think: list of authors). I made an attempt with #114 but didn't like it in the end. Here, we do not attempt to guess, instead expecting the user to know the encoding they need to use.
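
A small sketch of why guessing is fragile: the same bytes can decode without any error under more than one encoding, and only the few non-ASCII characters reveal the mistake. The sample strings are made up for illustration.

```python
# A row with two non-ASCII characters, stored as UTF-8 bytes.
raw = "Müller\tKöln\n".encode("utf-8")

as_utf8 = raw.decode("utf-8")      # correct text
as_latin1 = raw.decode("latin-1")  # no exception, but the umlauts are mangled

# A purely ASCII row decodes identically under both encodings,
# so it gives a detector no evidence either way.
header = b"name\tcity\n"
ascii_is_ambiguous = header.decode("utf-8") == header.decode("latin-1")
```

Since decoding with the wrong encoding can succeed silently, an explicit `encoding` argument is the more predictable choice.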

This PR also fixes an unrelated documentation typo to satisfy the codespell checks.

[^1]: https://docs.python.org/3/library/pathlib.html#pathlib.Path.open
[^2]: https://docs.python.org/3/library/functions.html#open
Reference
sfb1451/datalad-tabby!116