The following uses an expl3 (L3) sequence and the L3 MD5 hash function to implement your \calcHash. Note that \calcHash is used directly where the value is needed: it sets next content itself, rather than storing the hash in some other macro that is then assigned to next content.
\documentclass[10pt,a4paper]{article}
\usepackage{pgfplotstable}
\pgfplotsset{compat=newest}
\ExplSyntaxOn
% hash of the row currently being processed
\str_new:N \l__pascals_hash_str
% all hashes seen since the last \clearHashes
\seq_new:N \g__pascals_hashes_seq
\msg_new:nnn { pascals } { duplicate-hash }
  { Hash~ #1~ already~ used! }
\cs_generate_variant:Nn \str_set:Nn { Ne }
\cs_new_protected:Npn \__pascals_calc_hash:n #1
  {
    % expand the argument and store its MD5 hash
    \str_set:Ne \l__pascals_hash_str { \str_mdfive_hash:e {#1} }
    % known hash: raise an error; new hash: remember it
    \seq_if_in:NVTF \g__pascals_hashes_seq \l__pascals_hash_str
      { \msg_error:nnV { pascals } { duplicate-hash } \l__pascals_hash_str }
      { \seq_gput_right:NV \g__pascals_hashes_seq \l__pascals_hash_str }
    % hand the hash to pgfplotstable as the content of the new cell
    \pgfkeyslet { /pgfplots/table/create~ col/next~ content } \l__pascals_hash_str
  }
\NewDocumentCommand \clearHashes {} { \seq_gclear:N \g__pascals_hashes_seq }
\NewDocumentCommand \calcHash { m } { \__pascals_calc_hash:n {#1} }
\ExplSyntaxOff
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
}\mydata
\begin{document}
\clearHashes
\pgfplotstablecreatecol[
create col/assign/.code={%
\calcHash{\thisrow{X}\thisrow{Y}}%
}]{ID}{\mydata}
\pgfplotstablegetrowsof{\mydata}
\pgfmathtruncatemacro\myDataRows{\pgfplotsretval-1}
\pgfplotstabletypeset[string type]{\mydata}
\end{document}
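To see the duplicate check fire, a fourth row that repeats an earlier X/Y pair can be added. The fragment below is only a sketch meant to replace the table and the \pgfplotstablecreatecol call in the document above; the table name \mydataDup is a hypothetical name chosen for illustration.

```latex
% Sketch: assumes the preamble of the document above (\calcHash, \clearHashes).
% \mydataDup is a hypothetical table name used only for this illustration.
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
1 a
}\mydataDup

\clearHashes
\pgfplotstablecreatecol[
  create col/assign/.code={%
    \calcHash{\thisrow{X}\thisrow{Y}}%
  }]{ID}{\mydataDup}
% Rows 1 and 4 both expand to the string "1a", so they produce the same hash
% and the duplicate-hash message should be raised for row 4.
```

Because \g__pascals_hashes_seq is global and is only emptied by \clearHashes, leaving out \clearHashes between two \pgfplotstablecreatecol calls should extend the same check across several tables.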
A variant that only uses the first three characters of the resulting hash:
\documentclass[10pt,a4paper]{article}
\usepackage{pgfplotstable}
\pgfplotsset{compat=newest}
\ExplSyntaxOn
\str_new:N \l__pascals_hash_str
\seq_new:N \g__pascals_hashes_seq
\msg_new:nnn { pascals } { duplicate-hash }
  { Hash~ #1~ already~ used! }
\cs_generate_variant:Nn \str_set:Nn { Ne }
\cs_generate_variant:Nn \str_range:nnn { e }
\cs_new_protected:Npn \__pascals_calc_hash:n #1
  {
    % keep only characters 1 to 3 of the MD5 hash
    \str_set:Ne \l__pascals_hash_str
      { \str_range:enn { \str_mdfive_hash:e {#1} } { 1 } { 3 } }
    % known hash: raise an error; new hash: remember it
    \seq_if_in:NVTF \g__pascals_hashes_seq \l__pascals_hash_str
      { \msg_error:nnV { pascals } { duplicate-hash } \l__pascals_hash_str }
      { \seq_gput_right:NV \g__pascals_hashes_seq \l__pascals_hash_str }
    \pgfkeyslet { /pgfplots/table/create~ col/next~ content } \l__pascals_hash_str
  }
\NewDocumentCommand \clearHashes {} { \seq_gclear:N \g__pascals_hashes_seq }
\NewDocumentCommand \calcHash { m } { \__pascals_calc_hash:n {#1} }
\ExplSyntaxOff
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
}\mydata
\begin{document}
\clearHashes
\pgfplotstablecreatecol[
create col/assign/.code={%
\calcHash{\thisrow{X}\thisrow{Y}}%
}]{ID}{\mydata}
\pgfplotstablegetrowsof{\mydata}
\pgfmathtruncatemacro\myDataRows{\pgfplotsretval-1}
\pgfplotstabletypeset[string type]{\mydata}
\end{document}
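Truncating to three hexadecimal characters leaves only 16^3 = 4096 possible IDs, so two rows with different contents can end up with the same shortened hash and trigger the duplicate-hash error. As a rough estimate only, and assuming the hash output behaves like a uniformly random value (an idealisation, not something the code guarantees), the usual birthday approximation reads as follows (\text needs amsmath):

```latex
% N: number of possible truncated hashes, n: number of distinct rows
\[
  N = 16^{3} = 4096, \qquad
  P(\text{false duplicate}) \approx 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right)
\]
% Example: n = 76 gives n(n-1)/(2N) = 5700/8192, i.e. P of about 0.50
```

Under that assumption, a data set of roughly 76 distinct rows already has about even odds of at least one false duplicate, so a longer prefix may be the safer choice for larger tables.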
Yet another variant: this one uses the full hash by default, but takes an optional argument to use only the first n characters.
\documentclass[10pt,a4paper]{article}
\usepackage{pgfplotstable}
\pgfplotsset{compat=newest}
\ExplSyntaxOn
\str_new:N \l__pascals_hash_str
\seq_new:N \g__pascals_hashes_seq
\msg_new:nnn { pascals } { duplicate-hash }
  { Hash~ #1~ already~ used! }
\cs_generate_variant:Nn \str_set:Nn { Ne }
\cs_generate_variant:Nn \str_range:nnn { e }
% #1: material to hash, #2: index of the last character to keep
\cs_new_protected:Npn \__pascals_calc_hash:nn #1#2
  {
    \str_set:Ne \l__pascals_hash_str
      { \str_range:enn { \str_mdfive_hash:e {#1} } { 1 } {#2} }
    % known hash: raise an error; new hash: remember it
    \seq_if_in:NVTF \g__pascals_hashes_seq \l__pascals_hash_str
      { \msg_error:nnV { pascals } { duplicate-hash } \l__pascals_hash_str }
      { \seq_gput_right:NV \g__pascals_hashes_seq \l__pascals_hash_str }
    \pgfkeyslet { /pgfplots/table/create~ col/next~ content } \l__pascals_hash_str
  }
\NewDocumentCommand \clearHashes {} { \seq_gclear:N \g__pascals_hashes_seq }
\NewDocumentCommand \calcHash { O{-1} m } { \__pascals_calc_hash:nn {#2} {#1} }
\ExplSyntaxOff
\pgfplotstableread[]{
X Y
1 a
2 b
5 c
}\mydata
\begin{document}
\clearHashes
\pgfplotstablecreatecol[
create col/assign/.code={%
\calcHash[3]{\thisrow{X}\thisrow{Y}}%
}]{ID}{\mydata}
\pgfplotstablegetrowsof{\mydata}
\pgfmathtruncatemacro\myDataRows{\pgfplotsretval-1}
\pgfplotstabletypeset[string type]{\mydata}
\end{document}
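The default of -1 works because \str_range:nnn counts negative indices from the end of the string, so the range from 1 to -1 covers the whole 32-character MD5 hash. The two calls below are a small sketch of both forms; they are fragments for use inside create col/assign/.code of the document above, and the length 8 is an arbitrary value picked for illustration.

```latex
% Fragments for create col/assign/.code; use one of them per column.
\calcHash{\thisrow{X}\thisrow{Y}}%     no optional argument: end index -1, full hash
\calcHash[8]{\thisrow{X}\thisrow{Y}}%  only the first 8 characters of the hash
```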
\calcHash{1a}, \calcHash{2b} and \calcHash{5c} are not equal. There should be only an error, when I add again in the 4th row an 1a for example. – PascalS Mar 11 '24 at 12:44

\cs_generate_variant:Nn for the most likely culprit, you might want to test again with the current code. If it still errs for you, please provide the error message. – Skillmon Mar 11 '24 at 12:57

\StrLeft in my MWE? 2. What are you using \clearHashes for? I have commented it out, without any recognizable effect. – PascalS Mar 11 '24 at 13:46

\clearHashes for that (it clears the list of known hashes). For only the first three letters of the hash, see my edit. – Skillmon Mar 11 '24 at 14:40

\seq_if_in:NVTF, the previous version of all three code blocks had undefined behaviour if the hash was already in the sequence, this is now fixed. – Skillmon Mar 11 '24 at 14:48

\clearHashes is clear now! It's good to have it in this answer, but for me it is not relevant, because I have more than one table and the clue is to check all of these tables for the same duplicates :) – PascalS Mar 11 '24 at 14:56

\hashvalue result to three digits... Otherwise I have to do it twice. Once in my Tikz Pictures and the second time in my Tables... – PascalS Mar 11 '24 at 14:57

! Undefined control sequence. <argument> \str_mdfive_hash:e Even with This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) (preloaded format=pdflatex) & LaTeX2e <2022-11-01> patch level 1 & L3 programming layer <2023-05-05> This should be new enough, isn't it? – PascalS Mar 11 '24 at 20:44