
I asked this question yesterday but realized I didn't explain it well. I need to read several CSV files with different numbers of columns, and I need a command that automatically typesets each file as a table. The columns should wrap long text and their widths should be adjustable. Ideally the tabular column specification would be passed to the command as an extra argument, but I could not get that to work.

    \documentclass[12]{article}
    \usepackage{import}
    \usepackage{csvsimple}
    \usepackage[a4paper, total={6.25in, 9.75in}]{geometry}
\begin{document}

\begin{filecontents*}{1.csv}
    title 1,title 2,title 3,title 4,title 5,title 6,title 7
    78,1,1,16,7,1,9
    03,1,1,32,7,1,9
    98,1,2,16,8,2,9
    23,1,2,32,8,2,9
    43,1,4,16,10,4,9
    52,1,4,32,10,4,9
\end{filecontents*}


\begin{filecontents*}{2.csv}
    name,type,random
    sample 1,type 1,Lorem Ipsum has been the industry's standard dummy text ever since the 1500s when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries but also the leap into electronic typesetting remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum
    sample 2,type 2,There are many variations of passages of Lorem Ipsum available but the majority have suffered alteration in some form, by injected humour or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum you need to be sure there isn't anything embarrassing hidden in the middle of text
    sample 3,type 3,The standard chunk of Lorem Ipsum used since the 1500s is reproduced below for those interested. Sections 1.10.32 and 1.10.33 from
    sample 4,type 4,de Finibus Bonorum et Malorum" by Cicero are also reproduced in their exact original form
    sample 5,type 5,The generated Lorem Ipsum is therefore always free from repetition injected humour or non-characteristic words etc.
\end{filecontents*}



%to run the different files comment out one after head line and change file in \csvautotabularcenter
\makeatletter
\csvset{
autotabularcenter/.style={
    file=#1,
    after head=\csv@pretable\begin{tabular}{|p{2cm}| p{1cm}| p{12cm}|}\csv@tablehead,
    %after head=\csv@pretable\begin{tabular}{|p{1cm}|p{1cm}|p{1cm}|p{3cm}|p{2cm}|p{1cm}|p{4cm}|}\csv@tablehead,     %to be able to run 1.csv need to change tabular arguments
    table head=\hline\csvlinetotablerow\\\hline,
    late after line=\\,
    table foot=\\\hline,
    late after last line=\csv@tablefoot\end{tabular}\csv@posttable,
    command=\csvlinetotablerow},
}
\makeatother
\newcommand{\csvautotabularcenter}[2][respect underscore=true]{\csvloop{autotabularcenter={#2},#1}}    
\csvautotabularcenter{2.csv} %pass tabular arguments (eg |p{2cm}| p{1cm}| p{12cm}| or |p{1cm}|p{1cm}|p{1cm}|p{3cm}|p{2cm}|p{1cm}|p{4cm}|) from here

\end{document}

  • So everything the macro should be able to do is typeset a tabular from a csv file? Any other special requirements, like longtable support, or tabularx support? – Skillmon May 19 '22 at 15:53
  • No, nothing like that. It just needs to be able to define the width of the columns. One thing is that I don't want to manually enter the table data, because that defeats the whole purpose of automating tables. – user270659 May 19 '22 at 15:57
  • Please note that your second csv is malformatted (line two has a , in the text, so one column more than the others). – Skillmon May 19 '22 at 17:20
  • That's OK, this is just an example; I just wanted some random text. I wanted to make sure the text would wrap to the next line if it's long enough. – user270659 May 19 '22 at 17:32
  • Oh, also 12 is no option of the standard classes, I guess you meant to use 12pt. – Skillmon May 19 '22 at 18:02

1 Answer

The following implements \tabgen (from scratch, using only expl3-code, no other libraries).

The macro first builds the complete table body and then typesets the table. It has a few key=value options for customization; hopefully the comments in the code are enough to understand them.

\documentclass[12pt]{article}

\usepackage[a4paper, total={6.25in, 9.75in}]{geometry}

\begin{filecontents}{1.csv}
title 1,title 2,title 3,title 4,title 5,title 6,title 7
78,1,1,16,7,1,9
03,1,1,32,7,1,9
98,1,2,16,8,2,9
23,1,2,32,8,2,9
43,1,4,16,10,4,9
52,1,4,32,10,4,9
\end{filecontents}

\begin{filecontents}{2.csv}
name;type;random
sample 1;type 1;_^Lorem Ipsum has been the industry's standard dummy text ever since the 1500s when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries but also the leap into electronic typesetting remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum
sample 2;type 2;There are many variations of passages of Lorem Ipsum available but the majority have suffered alteration in some form, by injected humour or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum you need to be sure there isn't anything embarrassing hidden in the middle of text
sample 3;type 3;The standard chunk of Lorem Ipsum used since the 1500s is reproduced below for those interested. Sections 1.10.32 and 1.10.33 from
sample 4;type 4;de Finibus Bonorum et Malorum" by Cicero are also reproduced in their exact original form
sample 5;type 5;The generated Lorem Ipsum is therefore always free from repetition injected humour or non-characteristic words etc.
\end{filecontents}

\ExplSyntaxOn
\tl_new:N  \l__tabgen_body_tl
\tl_new:N  \l__tabgen_line_tl
\seq_new:N \l__tabgen_line_seq
\int_new:N \l__tabgen_cols_int
\ior_new:N \l__tabgen_file_stream
\keys_define:nn { tabgen }
  {
    % environment to use
     env      .tl_set:N   = \l__tabgen_env_tl
    ,env      .initial:n  = tabular
    % extra argument to the environment (between env and col)
    ,env-arg  .tl_set:N   = \l__tabgen_envarg_tl
    % column-spec to use (as mandatory argument to env)
    ,col      .tl_set:N   = \l__tabgen_col_tl
    % if no explicit col is given use this for each column (if empty uses p{})
    ,auto-col .tl_set:N   = \l__tabgen_autocol_tl
    % before title line
    ,top      .tl_set:N   = \l__tabgen_top_tl
    % between title line and first line
    ,mid      .tl_set:N   = \l__tabgen_mid_tl
    ,mid      .initial:n  = \\
    % after last line
    ,bot      .tl_set:N   = \l__tabgen_bot_tl
    % end of each normal line
    ,eol      .tl_set:N   = \l__tabgen_eol_tl
    ,eol      .initial:n  = \\
    % beginning of each normal line
    ,bol      .tl_set:N   = \l__tabgen_bol_tl
    % column separator in input file
    ,sep      .tl_set:N   = \l__tabgen_sep_tl
    ,sep      .initial:n  = {,}
    ,str      .bool_set:N = \l__tabgen_str_bool
    ,replace  .tl_set:N   = \l__tabgen_replace_tl
  }
\cs_generate_variant:Nn \prg_replicate:nn { ne }
\cs_generate_variant:Nn \tl_replace_all:Nnn { NV }
\cs_generate_variant:Nn \seq_set_split:Nnn { NVV }
\NewDocumentCommand \tabgen { O{} m }
  {
    \group_begin:
      \keys_set:nn { tabgen } {#1}
      \tabgen_read_file:n {#2}
      \tabgen_replace:V \l__tabgen_replace_tl
      \tl_if_empty:NTF \l__tabgen_envarg_tl
        {
          \tabgen_output:VVV
            \l__tabgen_env_tl \l__tabgen_col_tl \l__tabgen_body_tl
        }
        {
          \tabgen_output:VVVV
            \l__tabgen_env_tl \l__tabgen_col_tl \l__tabgen_body_tl
            \l__tabgen_envarg_tl
        }
    \group_end:
  }
\NewDocumentCommand \tabgenSetup { m } { \keys_set:nn { tabgen } {#1} }
\cs_new:Npn \tabgenHead {}
\cs_new_protected:Npn \__tabgen_autocol:
  {
    \int_set:Nn \l__tabgen_cols_int { \seq_count:N \l__tabgen_line_seq }
    \tl_set:Nx \l__tabgen_col_tl
      {
        \prg_replicate:ne \l__tabgen_cols_int
          {
            \tl_if_empty:NTF \l__tabgen_autocol_tl
              {
                p
                  {
                    \dim_eval:n
                      { \linewidth / \l__tabgen_cols_int - 2 \tabcolsep }
                  }
              }
              { \exp_not:V \l__tabgen_autocol_tl }
          }
      }
  }
\cs_new_protected:Npn \__tabgen_head_line:
  {
    \ior_get:NN \l__tabgen_file_stream \l__tabgen_line_tl
    \seq_set_split:NVV \l__tabgen_line_seq \l__tabgen_sep_tl \l__tabgen_line_tl
    \tl_if_empty:NT \l__tabgen_col_tl { \__tabgen_autocol: }
    \tl_clear:N \l__tabgen_line_tl
    \seq_map_inline:Nn \l__tabgen_line_seq
      { \tl_put_right:Nn \l__tabgen_line_tl { & {##1} } }
    \tl_set:Nx \l__tabgen_line_tl { \tl_tail:N \l__tabgen_line_tl }
    \cs_set_eq:NN \tabgenHead \l__tabgen_line_tl
  }
\cs_new_protected:Npn \__tabgen_body:nN #1#2
  {
    #2 \l__tabgen_file_stream \l__tabgen_line_tl
      {
        \tl_replace_all:Nnn \l__tabgen_line_tl {#1} { & }
        \tl_put_right:Nx \l__tabgen_body_tl
          {
            \exp_not:V \l__tabgen_bol_tl
            \exp_not:V \l__tabgen_line_tl
            \exp_not:V \l__tabgen_eol_tl
          }
      }
  }
\cs_new_protected:Npn \tabgen_replace:n
  { \keyval_parse:NNn \__tabgen_replace_err:n \__tabgen_replace:nn }
\msg_new:nnn { tabgen } { missing-replacement }
  { Missing~ replacement~ for~ input~ #1 }
\cs_new_protected:Npn \__tabgen_replace_err:n
  { \msg_error:nnn { tabgen } { missing-replacement } }
\cs_new_protected:Npn \__tabgen_replace:nn
  { \tl_replace_all:Nnn \l__tabgen_body_tl }
\cs_generate_variant:Nn \tabgen_replace:n { V }
\cs_generate_variant:Nn \__tabgen_body:nN { V }
\cs_generate_variant:Nn \__tabgen_body:nN { e }
\cs_new_protected:Npn \tabgen_read_file:n #1
  {
    \ior_open:Nn \l__tabgen_file_stream {#1}
    \__tabgen_head_line:
    \tl_set:Nx \l__tabgen_body_tl
      {
        \exp_not:V \l__tabgen_top_tl
        \exp_not:V \l__tabgen_line_tl
        \exp_not:V \l__tabgen_mid_tl
      }
    \bool_if:NTF \l__tabgen_str_bool
      {
        \__tabgen_body:eN { \tl_to_str:N \l__tabgen_sep_tl }
          \ior_str_map_variable:NNn
      }
      { \__tabgen_body:VN \l__tabgen_sep_tl \ior_map_variable:NNn }
    \ior_close:N \l__tabgen_file_stream
    \tl_put_right:NV \l__tabgen_body_tl \l__tabgen_bot_tl
  }
\cs_new_protected:Npn \tabgen_output:nnn #1#2#3
  { \begin {#1} {#2} #3 \end {#1} }
\cs_generate_variant:Nn \tabgen_output:nnn { VVV }
\cs_new_protected:Npn \tabgen_output:nnnn #1#2#3#4
  { \begin {#1} {#4} {#2} #3 \end {#1} }
\cs_generate_variant:Nn \tabgen_output:nnnn { VVVV }
\ExplSyntaxOff

\usepackage{booktabs} \usepackage{siunitx} \usepackage{tabularx}

\newcommand\gobble[1]{} % used to gobble an \addlinespace after \midrule

\tabgenSetup { top = \toprule ,mid = \\\midrule ,bot = \bottomrule }

\begin{document} \noindent \tabgen[auto-col=S]{1.csv}

\noindent \tabgen [ sep=;, col=llX, env=tabularx, env-arg=\linewidth, mid=\\\midrule\gobble, bol=\addlinespace, replace={_=\_,^=\^{}} ] {2.csv} \end{document}
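
Since the question asks for individually adjustable column widths, here is a minimal sketch of how that could look with the `col` key. It is only a fragment: it assumes the `\tabgen` definition, the `\tabgenSetup` call, and the file `2.csv` (with `;` separators) from the example document above, and the widths are arbitrary values chosen to fit the text width set by `geometry`.

```latex
% Fragment for the document body; assumes \tabgen, \tabgenSetup and 2.csv
% from the example above. The widths below are illustrative, not prescribed.
\noindent
\tabgen
  [
     sep=;                        % 2.csv uses ; as column separator
    ,col={p{2cm}p{1.5cm}p{9cm}}   % one p{<width>} per column, text wraps
    ,replace={_=\_,^=\^{}}        % 2.csv contains _ and ^ in the first row
  ]{2.csv}
```

If vertical rules are wanted despite the typographic caveat raised in the comments, a `|` can be placed in the `col` value, for example `col={p{2cm}p{1.5cm}|p{9cm}}`.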



Example usage for a typical longtable setup:

% packages longtable, siunitx, and booktabs loaded in the preamble
\tabgen
  [
     auto-col=S
    ,env=longtable
    ,top={\caption{Test 1}\\\toprule}
    ,mid=
      {
        \\\midrule\endfirsthead
        \caption[]{Test 1 (continued)}\\\toprule\tabgenHead\\\midrule\endhead
        \bottomrule\endfoot
      }
    ,bot={}
  ]{1.csv}
Skillmon
  • When I run the code the text does not wrap; it just runs off to the right. – user270659 May 19 '22 at 18:06
  • @user270659 Well, the output I show is the output I get (without the 12pt option). Is your TeXLive installation up to date? Did you change the separator in 2.csv to ; (LaTeX will not overwrite it if it already exists, so you either need to remove it or have to change it in the existing file)? – Skillmon May 19 '22 at 18:11
  • Thank you, changing , to ; worked. Is the program still going to run if the separators are commas? The original csv files won't have ;. Also, I don't know how to set the width of each column separately. – user270659 May 19 '22 at 18:16
  • @user270659 yesn't on the first question. The macro doesn't know which comma would be one that should separate columns and which is part of a column (you could use {} in your csv, but that requires retouching as well). So if your input file uses , but contains unprotected , as part of a column text, that row would have extraneous &s and therefore throw low-level extra alignment errors. If you neither set col nor auto-col you'll get auto-balanced p-type columns. If that's not what you want you can use col=p{1cm}p{2cm}p{3cm} for three columns of different widths. – Skillmon May 19 '22 at 18:29
  • If that splitting behaviour is a problem for you, I could change things to only split on the first $n$ separators (with throwing a warning if more), and $n$ could be auto-determined by the headrow. – Skillmon May 19 '22 at 18:30
  • The real data will not have unprotected commas, so that shouldn't be an issue. One last thing: is there a way to insert vertical lines between the columns and at the sides? – user270659 May 19 '22 at 18:38
  • @user270659 that is widely regarded as bad typography in the western world (Asian documents differ in that regard though). But yes, this is still possible: you could use | in the col argument (but the automatic column setups don't support it, I'm afraid). For instance, in the second call of \tabgen in the example above you could use col=ll|X instead of col=llX. – Skillmon May 19 '22 at 18:54
  • What about horizontal lines, are they possible after each row? And can _ and * be used in the csv file? If we need to add {} for them, that won't be an issue. – user270659 May 19 '22 at 19:03
  • @user270659 everything except the separator and line endings is interpreted as normal LaTeX code. If you'd prefer, I could change this to strings rather easily. If you want horizontal lines, just change the \addlinespace in the second example to \midrule or \hline. – Skillmon May 19 '22 at 19:16
  • I tried adding {} around the commas and that worked, but when the file has _ or * it doesn't work anymore. Is there a fix for this, like a respect underscore=true argument? – user270659 May 19 '22 at 19:52
  • @user270659 as I said, the rest is treated as normal LaTeX input; _ has a meaning in LaTeX (math subscript) and errors out. I'll add a stringify switch. – Skillmon May 19 '22 at 20:00
  • @user270659 if you now use str=true in the options only the first row is interpreted as normal LaTeX input, rest is treated as a string. Note that output depends on font and engine (in pdfLaTeX you'll not get _^ output as you'd expect with normal Computer Modern -- you'd have to use \ttfamily or similar; in LuaLaTeX or XeLaTeX everything should work just fine). – Skillmon May 19 '22 at 20:15
  • Another possibility would be to use replace={_=\_, ^=\^{}} (or something similar, note that this doesn't work for stuff in braces, and it is rather slow, I'd prefer str=true). – Skillmon May 19 '22 at 20:29
  • Is there a way to add a table caption so that it is produced automatically when we pass it to \tabgen? Also, where would I put str=true? It does not work when I put it next to auto-col. – user270659 May 20 '22 at 13:27
  • @user270659 make sure to also put a comma between auto-col's value and str=true (so, e.g., auto-col=c, str=true). Since \tabgen only produces the tabular environment you can simply use \begin{table}\caption{generated table}\tabgen{1.csv}\end{table}. – Skillmon May 20 '22 at 17:09
  • Did you remove the ability to add protected commas in one of the later edits? I cannot add {,}; it gives me an error. – user270659 May 24 '22 at 14:53
  • @user270659 if you're using the str=true option you can't use protected commas. There never was the possibility to add unprotected commas to fields, each comma is interpreted as the start of a new column, and if a row has more columns than the first row this will result in an error (or you'll have to manually define the columns using the col-key, in which case you can specify the number of columns and won't get an error, but still a new column for each comma). – Skillmon May 24 '22 at 15:41
  • Is there no way to add unprotected commas with str=true, even if I manually define the columns? – user270659 May 24 '22 at 15:49
  • As I already said, each comma is considered the start of a new column. I could patch in to only care for the first n commas and keep the rest, but that's not a clean solution, imho, and generally not how CSV works. – Skillmon May 25 '22 at 06:40
  • I figured out that if I just save the csv with ; as separators, like you did in the original code, it solves all of the problems. One last thing: I have a table that's too long to fit on one page. Is it possible to split the table over two or more pages if necessary? – user270659 May 25 '22 at 14:26
  • @user270659 sure, use the following options: env=longtable, mid=\\\midrule\endhead\bottomrule\endfoot, bot={} and put \usepackage{longtable} in your preamble. – Skillmon May 25 '22 at 16:15
  • @user270659 note that longtable shouldn't be used in a table-environment. Instead you can use a \caption directly inside of longtable's body (just take a look at its documentation). – Skillmon May 25 '22 at 16:19
  • Thank you so much for your help – user270659 May 25 '22 at 18:11
  • Can an argument be added that allows special characters like Ω and ± to be read from the csv file and displayed properly? – user270659 May 30 '22 at 14:46
  • @user270659 use a Unicode-aware engine like LuaLaTeX. Alternatively, set up the symbols you need with \DeclareUnicodeCharacter. Since input is interpreted as normal LaTeX input (with str=false), the usual methods apply. See also this: https://tex.stackexchange.com/questions/34604/entering-unicode-characters-in-latex – Skillmon May 30 '22 at 16:59
  • When using longtable, where do I add the caption? I made this command but it won't let me add the caption: \newcommand{\customlongtable}[2]{% \begin{center} \tabgen[str=true, sep=;, env=longtable, #2]{#1} %\captionof{test} \end{center} } – user270659 May 30 '22 at 17:06
  • @user270659 I've included a usage example for longtable (note that longtable is quite different from other table environments with its \endfirsthead, \endhead, \endfoot constructs). Please note that the example requires an adaptation in the code above (I've added \tabgenHead to access the contents of the first row multiple times). – Skillmon May 31 '22 at 06:54
  • @user270659 also note that \captionof takes two arguments, correct usage would be \captionof{table}{Test} (but still the usage with longtable as you did would be wrong). – Skillmon May 31 '22 at 06:55
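
For ordinary (non-longtable) output, the caption advice from the comments can be sketched as follows. This is a fragment under stated assumptions: `\tabgen`, the `\tabgenSetup` call, and `1.csv` come from the example document above; the caption text is a placeholder; `\centering` is an addition of this sketch and not part of the comment it is based on.

```latex
% Fragment for the document body; assumes \tabgen, \tabgenSetup and 1.csv
% from the example above. \tabgen only produces the tabular itself, so the
% float and the caption are added around it.
\begin{table}
  \centering                             % added here for layout only
  \caption{Generated table}              % placeholder caption text
  \tabgen[auto-col=c, str=true]{1.csv}   % note the comma between the keys
\end{table}
```

With `env=longtable` this wrapper should not be used; the caption then goes into the `top` key, as in the longtable usage example in the answer.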