I want to plot something from the file "processor.csv" from http://cpudb.stanford.edu/cpudb.1416196069.zip.

How can I automatically escape this file, or ANY OTHER, so that I can use it with pgfplots? I've already tried editing it manually with find and replace, but I still get errors, and in my opinion this should NOT be the way to go.

Is there no automatic way to escape everything that has to be escaped, no matter which CSV file I want to use?

This is how I try to plot it:

\begin{figure}
\begin{tikzpicture}
\begin{axis}
\addplot table [x=clock, y=created_at, col sep=comma] {./processor.csv};
\end{axis}
\end{tikzpicture}
\end{figure}
  • Could you please provide an MWE which contains everything needed in order to see your issue? I won't open a .zip from some external link. Sorry. You can have a look at this guide for how to prune your code for this purpose. Especially it would be nice to see, how you try to plot that. Like this, we do not have to code all the pgfplots stuff for you. – LaRiFaRi Jun 12 '15 at 13:40
  • Sorry but I want to plot exactly that data and these are CSV files with huge data amounts so how am I supposed to copy only parts of it when I want to use all? The question is very simple. Either there is an option/tool whatsoever to escape ANY CSV file for pgfplots or there is not. If there is no such tool I will create the plot with a different tool or use a screenshot from the website. – user2656284 Jun 12 '15 at 13:49
  • Please read the section "Data" in the answer I linked and add "ANY CSV". One line of data will be enough. Your code should be compilable and complete. Like this, we are able to copy it, try it, fix it. Thanks. – LaRiFaRi Jun 12 '15 at 13:51
  • We had a saying in my mathematics department: any number bigger than two is needlessly big. Only a few rows of data will be more than sufficient to show that a solution will work for arbitrarily large amounts of data (sans memory overflow, but there's not much to be done about that). – Sean Allred Jun 12 '15 at 14:36

1 Answer

The CSV reading capabilities of pgfplots lack escaping support. Consequently, the short answer is: there is no way to (un)escape arbitrary CSV columns for pgfplots, sorry.

There is also no builtin support to plot datetimes at the required precision (your data files live on a scale of deciseconds or milliseconds).

Your comment indicates that you are looking for a fully fledged, robust solution which simply works in all cases. If so, you may need to evaluate a different tool.


That said, there are ways to get the plot that you asked for. This stretches the limits of pgfplots beyond its builtin capabilities, and is not stable for all input types, so beware.

Regarding the question of how to read the data files: pgfplots can silently ignore selected characters in its input files. In your case, that is enough to read the file. The way to go would be something like

\addplot table [x=clock, y=created_at, col sep=comma, ignore chars={\#,\"}] {./processor.csv};

in which case any occurrence of either " or # would silently be stripped away. This really reads all records of the attached file.

Fortunately, this data file contains no , characters inside of the string columns (which is pure luck, of course - it would fail as soon as a data set contains a comma).
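
As a minimal, self-contained sketch of the ignore chars approach: the file name excerpt.csv and its three rows are made up for illustration and are not taken from processor.csv. The example assumes a LaTeX installation in which filecontents* may appear before \documentclass, and it plots a numeric column whose values are wrapped in double quotes:

```latex
% Hypothetical three-row excerpt, written to disk so that the example is
% self-contained. filecontents* does not overwrite an existing file.
\begin{filecontents*}{excerpt.csv}
id,clock
1,"3600"
2,"3600"
3,"3800"
\end{filecontents*}

\documentclass{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.12}

\begin{document}
\begin{tikzpicture}
  \begin{axis}[xlabel=id, ylabel=clock]
    % ignore chars strips every double quote before the cells are parsed
    \addplot+[only marks]
      table [x=id, y=clock, col sep=comma, ignore chars={\"}] {excerpt.csv};
  \end{axis}
\end{tikzpicture}
\end{document}
```

Without ignore chars, the quoted cells in the clock column should fail to parse as numbers. With it, the quotes are removed everywhere in the file, which is also why a comma inside a quoted field would still be treated as a column separator.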

There is another limitation of pgfplots, as mentioned above: your data file uses datetimes on one of the axes, and the provided data file lives on a scale of tens of seconds. Pgfplots, however, has to use the limited arithmetic offered by TeX (at most 32-bit integers), and its builtin datetime support resolves ranges down to minutes at best.

While this could be tweaked using suitable transformations (as in pgfplots data time format), it is not an "out-of-the-box" solution. Here is a suggested adaptation which supports times (but ignores the date part). It is adapted from the answer of @Jake mentioned above:

\documentclass{standalone}

\usepackage{pgfplots}

\pgfplotsset{compat=1.12}

\usepgfplotslibrary{dateplot}

% this strips the date (silently!) and considers only the time
\def\transformtime#1 #2:#3:#4.#5!{%
    \pgfkeys{/pgf/fpu=true,/pgf/fpu/output format=fixed}%
    \count2=#2 \edef\hour{\the\count2}%
    \count2=#3 \edef\min{\the\count2}%
    \count2=#4 \edef\sec{\the\count2}%
    \pgfmathparse{\hour*3600-\pgfkeysvalueof{/pgfplots/timeplot zero}*3600+\min*60+\sec + 0.#5}%
    \pgfkeys{/pgf/fpu=false}%
    %\message{got #1 #2:#3:#4.#5 -> \pgfmathresult^^J}%
}

\pgfplotsset{
    timeplot zero/.initial=0,
    timeplot/.style={
        y coord trafo/.code={\expandafter\transformtime##1!},
        y coord inv trafo/.code={%
            \pgfkeys{/pgf/fpu=true,/pgf/fpu/output format=fixed}%
            \pgfmathsetmacro\hours{floor(##1/3600)+\pgfkeysvalueof{/pgfplots/timeplot zero}}%
            \pgfmathsetmacro\minutes{floor((##1-(\hours-\pgfkeysvalueof{/pgfplots/timeplot zero})*3600)/60)}%
            \pgfmathsetmacro\seconds{##1-floor(##1/60)*60}%
            \def\pgfmathresult{\pgfmathprintnumber{\hours}:\pgfmathprintnumber{\minutes}:\pgfmathprintnumber[fixed zerofill]{\seconds}}%
            \pgfkeys{/pgf/fpu=false}%
        },
        scaled y ticks=false,
        yticklabel=\tick,
    }
}

\begin{document}

\begin{tikzpicture}
    \begin{axis}[
        xmax=1000,
        timeplot,
    ]
    \addplot+[only marks]
        table [x=clock, y=created_at, col sep=comma, ignore chars={\#,\"}] {./processor.csv};
    \end{axis}
\end{tikzpicture}
\end{document}
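
For reference, the arithmetic behind \transformtime and the inverse transformation can be written out as follows. The symbols h, m, s, f and h_0 are introduced here only for illustration: h, m and s are the hour, minute and second fields, f is the string of fractional digits, and h_0 stands for the value of timeplot zero. The worked numbers use the timestamp 2014-11-17 00:01:26.708453 from the first record of the data file, with timeplot zero left at 0:

```latex
\begin{align*}
  t &= 3600\,(h - h_0) + 60\,m + s + 0.f \\
    &= 3600\,(0 - 0) + 60 \cdot 1 + 26 + 0.708453 = 86.708453 \\[1ex]
  \text{hours}   &= \lfloor t/3600 \rfloor + h_0 = 0 \\
  \text{minutes} &= \lfloor (t - 3600\,(\text{hours} - h_0))/60 \rfloor = 1 \\
  \text{seconds} &= t - 60\,\lfloor t/60 \rfloor = 26.708453
\end{align*}
```

The date part (#1 in the macro) does not enter the computation at all, which is why all data points have to lie on the same day.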


LIMITATIONS:

  1. a comma inside a quoted field of the CSV file will break the parsing routine
  2. all data points must fall on the same day; the macro performs no sanity check
  3. your data file contains a couple of data points without an x coordinate; these are dropped

The data file used to produce this plot is

id,created_at,updated_at,manufacturer_id,processor_family_id,microarchitecture_id,code_name_id,technology_id,cache_on_id,cache_off_id,die_photo_id,model,date,clock,max_clock,hw_nthreadspercore,hw_ncores,tdp,source,bus_width,transistors,die_size,vdd_low,vdd_high
1,"2014-11-17 00:01:26.708453","2014-11-17 00:01:26.708453",9,1,,1,,1,2,,3.60E,,3600,,1,1,110,http://ark.intel.com/products/27089/64-bit-Intel-Xeon-Processor-3_60E-GHz-2M-Cache-800-MHz-FSB,,169,135,1.2875,1.3875
2,"2014-11-17 00:01:26.732099","2014-11-17 00:01:26.732099",9,1,,1,,1,2,,3.6,,3600,,2,1,110,http://ark.intel.com/products/28019/64-bit-Intel-Xeon-Processor-3_60-GHz-2M-Cache-800-MHz-FSB,,169,135,1.25,1.388
3,"2014-11-17 00:01:26.754340","2014-11-17 00:01:26.754340",9,1,,1,,1,2,,3.80E,,3800,,1,1,110,http://ark.intel.com/products/27092/64-bit-Intel-Xeon-Processor-3_80E-GHz-2M-Cache-800-MHz-FSB,,169,135,1.2875,1.3875
4,"2014-11-17 00:01:26.796928","2014-11-17 00:01:26.796928",9,1,,2,,3,2,,,,3660,,2,1,110,http://ark.intel.com/products/27103/64-bit-Intel-Xeon-Processor-3_66-GHz-1M-Cache-667-MHz-FSB,,125,112,1.2875,1.4
5,"2014-11-17 00:01:26.847382","2014-11-17 00:01:26.847382",9,2,,3,,4,2,,310,,2130,,1,1,73,http://ark.intel.com/products/27104/Intel-Celeron-D-Processor-310-(256K-Cache-2_13-GHz-533-MHz-FSB),,125,112,1.25,1.4
6,"2014-11-17 00:01:26.868530","2014-11-17 00:01:26.868530",9,2,,3,,4,2,,315,,2260,,1,1,73,http://ark.intel.com/products/27105/Intel-Celeron-D-Processor-315-(256K-Cache-2_26-GHz-533-MHz-FSB),,125,112,1.25,1.4
7,"2014-11-17 00:01:26.888672","2014-11-17 00:01:26.888672",9,2,,3,,4,2,,315J,,2260,,1,1,73,http://ark.intel.com/products/27106/Intel-Celeron-D-Processor-315315J-(256K-Cache-2_26-GHz-533-MHz-FSB),,125,112,1.25,1.4
8,"2014-11-17 00:01:26.908313","2014-11-17 00:01:26.908313",9,2,,3,,4,2,,320,,2400,,1,1,73,http://ark.intel.com/products/27107/Intel-Celeron-D-Processor-320-(256K-Cache-2_40-GHz-533-MHz-FSB),,125,112,1.25,1.4
9,"2014-11-17 00:01:26.929046","2014-11-17 00:01:26.929046",9,2,,3,,4,2,,325,,2530,,1,1,73,http://ark.intel.com/products/27108/Intel-Celeron-D-Processor-325-(256K-Cache-2_53-GHz-533-MHz-FSB),,125,112,1.25,1.4
10,"2014-11-17 00:01:26.951261","2014-11-17 00:01:26.951261",9,2,,3,,4,2,,325J,,2530,,1,1,84,http://ark.intel.com/products/27110/Intel-Celeron-D-Processor-325J-(256K-Cache-2_53-GHz-533-MHz-FSB),,125,112,1.25,1.4
11,"2014-11-17 00:01:26.974090","2014-11-17 00:01:26.974090",9,2,,3,,4,2,,326,,2530,,1,1,84,http://ark.intel.com/products/27111/Intel-Celeron-D-Processor-326-(256K-Cache-2_53-GHz-533-MHz-FSB),,125,112,1.25,1.4
12,"2014-11-17 00:01:26.995734","2014-11-17 00:01:26.995734",9,2,,3,,4,2,,330,,2660,,1,1,73,http://ark.intel.com/products/27112/Intel-Celeron-D-Processor-330-(256K-Cache-2_66-GHz-533-MHz-FSB),,125,112,1.25,1.4
13,"2014-11-17 00:01:27.018634","2014-11-17 00:01:27.018634",9,2,,3,,4,2,,330J,,2660,,1,1,84,http://ark.intel.com/products/27114/Intel-Celeron-D-Processor-330J-(256K-Cache-2_66-GHz-533-MHz-FSB),,125,112,1.25,1.4
14,"2014-11-17 00:01:27.039080","2014-11-17 00:01:27.039080",9,2,,3,,4,2,,331,,2660,,1,1,84,http://ark.intel.com/products/27115/Intel-Celeron-D-Processor-331-(256K-Cache-2_66-GHz-533-MHz-FSB),,125,112,1.25,1.4
15,"2014-11-17 00:01:27.062386","2014-11-17 00:01:27.062386",9,2,,3,,4,2,,335J,,2800,,1,1,84,http://ark.intel.com/products/27118/Intel-Celeron-D-Processor-335J-(256K-Cache-2_80-GHz-533-MHz-FSB),,125,112,1.25,1.4
16,"2014-11-17 00:01:27.142737","2014-11-17 00:01:27.142737",9,2,,3,,4,2,,336,,2800,,1,1,84,http://ark.intel.com/products/27119/Intel-Celeron-D-Processor-336-(256K-Cache-2_80-GHz-533-MHz-FSB),,125,112,1.25,1.4
17,"2014-11-17 00:01:27.165279","2014-11-17 00:01:27.165279",9,2,,3,,4,2,,340,,2930,,1,1,73,http://ark.intel.com/products/27120/Intel-Celeron-D-Processor-340-(256K-Cache-2_93-GHz-533-MHz-FSB),,125,112,1.25,1.4
18,"2014-11-17 00:01:27.188150","2014-11-17 00:01:27.188150",9,2,,3,,4,2,,340J,,2930,,1,1,84,http://ark.intel.com/products/27121/Intel-Celeron-D-Processor-340J-(256K-Cache-2_93-GHz-533-MHz-FSB),,125,112,1.25,1.4
19,"2014-11-17 00:01:27.208418","2014-11-17 00:01:27.208418",9,2,,3,,4,2,,345,,3060,,1,1,73,http://ark.intel.com/products/27123/Intel-Celeron-D-Processor-345-(256K-Cache-3_06-GHz-533-MHz-FSB),,125,112,1.25,1.4
20,"2014-11-17 00:01:27.228689","2014-11-17 00:01:27.228689",9,2,,3,,4,2,,345J,,3060,,1,1,84,http://ark.intel.com/products/27124/Intel-Celeron-D-Processor-345J-(256K-Cache-3_06-GHz-533-MHz-FSB),,125,112,1.25,1.4
21,"2014-11-17 00:01:27.251224","2014-11-17 00:01:27.251224",9,2,,3,,4,2,,346,,3060,,1,1,84,http://ark.intel.com/products/27125/Intel-Celeron-D-Processor-346-(256K-Cache-3_06-GHz-533-MHz-FSB),,125,112,1.25,1.4
22,"2014-11-17 00:01:27.272903","2014-11-17 00:01:27.272903",9,2,,3,,4,2,,350,,3200,,1,1,73,http://ark.intel.com/products/27126/Intel-Celeron-D-Processor-350-(256K-Cache-3_20-GHz-533-MHz-FSB),,125,112,1.25,1.4
23,"2014-11-17 00:01:27.293688","2014-11-17 00:01:27.293688",9,2,,3,,4,2,,350J,,3200,,1,1,73,http://ark.intel.com/products/27127/Intel-Celeron-D-Processor-350350J-(256K-Cache-3_20-GHz-533-MHz-FSB),,125,112,1.25,1.4
24,"2014-11-17 00:01:27.316604","2014-11-17 00:01:27.316604",9,2,,3,,4,2,,351,,3200,,1,1,84,http://ark.intel.com/products/27128/Intel-Celeron-D-Processor-351-(256K-Cache-3_20-GHz-533-MHz-FSB),,125,112,1.25,1.4
25,"2014-11-17 00:01:27.338452","2014-11-17 00:01:27.338452",9,2,,3,,4,2,,355,,3330,,1,1,84,http://ark.intel.com/products/27130/Intel-Celeron-D-Processor-355-(256K-Cache-3_33-GHz-533-MHz-FSB),,125,112,1.25,1.4
26,"2014-11-17 00:01:27.381180","2014-11-17 00:01:27.381180",9,2,,4,,5,2,,356,,3330,,1,1,86,http://ark.intel.com/products/27131/Intel-Celeron-D-Processor-356-(512K-Cache-3_33-GHz-533-MHz-FSB),,188,81,1.25,1.325
27,"2014-11-17 00:01:27.400815","2014-11-17 00:01:27.400815",9,2,,4,,5,2,,360,,3460,,1,1,65,http://ark.intel.com/products/27132/Intel-Celeron-D-Processor-360-(512K-Cache-3_46-GHz-533-MHz-FSB),,188,81,1.25,1.325
28,"2014-11-17 00:01:27.442228","2014-11-17 00:01:27.442228",9,3,,5,,5,2,,310,,1200,,1,1,24.5,http://ark.intel.com/products/27138/Intel-Celeron-M-Processor-310-(512K-Cache-1_20-GHz-400-MHz-FSB),,77,83,,1.356
29,"2014-11-17 00:01:27.462385","2014-11-17 00:01:27.462385",9,3,,5,,5,2,,330,,1400,,1,1,24.5,http://ark.intel.com/products/27140/Intel-Celeron-M-Processor-330-(512K-Cache-1_40-GHz-400-MHz-FSB),,77,83,,1.356
30,"2014-11-17 00:01:27.484997","2014-11-17 00:01:27.484997",9,3,,5,,5,2,,340,,1500,,1,1,24.5,http://ark.intel.com/products/27141/Intel-Celeron-M-Processor-340-(512K-Cache-1_50-GHz-400-MHz-FSB),,77,83,,1.356
31,"2014-11-17 00:01:27.518347","2014-11-17 00:01:27.518347",9,3,,6,,3,2,,350,,1300,,1,1,21,http://ark.intel.com/products/27142/Intel-Celeron-M-Processor-350-(1M-Cache-1_30-GHz-400-MHz-FSB),,144,87,,1.26
32,"2014-11-17 00:01:27.538306","2014-11-17 00:01:27.538306",9,3,,6,,3,2,,360,,1400,,1,1,21,http://ark.intel.com/products/27143/Intel-Celeron-M-Processor-360-(1M-Cache-1_40-GHz-400-MHz-FSB),,144,87,,
33,"2014-11-17 00:01:27.558462","2014-11-17 00:01:27.558462",9,3,,6,,3,2,,380,,1600,,1,1,21,http://ark.intel.com/products/27146/Intel-Celeron-M-Processor-380-(1M-Cache-1_60-GHz-400-MHz-FSB),,144,87,1.004,1.292
34,"2014-11-17 00:01:27.580314","2014-11-17 00:01:27.580314",9,3,,6,,3,2,,390,,1700,,1,1,21,http://ark.intel.com/products/27147/Intel-Celeron-M-Processor-390-(1M-Cache-1_70-GHz-400-MHz-FSB),,144,87,1.004,1.292
35,"2014-11-17 00:01:27.611632","2014-11-17 00:01:27.611632",9,3,,7,,3,2,,410,,1460,,1,1,27,http://ark.intel.com/products/27148/Intel-Celeron-M-Processor-410-(1M-Cache-1_46-GHz-533-MHz-FSB),,151,90,1,1.3
36,"2014-11-17 00:01:27.632430","2014-11-17 00:01:27.632430",9,3,,7,,3,2,,420,,1600,,1,1,27,http://ark.intel.com/products/27149/Intel-Celeron-M-Processor-420-(1M-Cache-1_60-GHz-533-MHz-FSB),,151,90,1,1.3
37,"2014-11-17 00:01:27.656233","2014-11-17 00:01:27.656233",9,3,,7,,3,2,,430,,1730,,1,1,27,http://ark.intel.com/products/27150/Intel-Celeron-M-Processor-430-(1M-Cache-1_73-GHz-533-MHz-FSB),,151,90,1,1.3
38,"2014-11-17 00:01:27.676503","2014-11-17 00:01:27.676503",9,3,,7,,3,2,,450,,2000,,1,1,27,http://ark.intel.com/products/27152/Intel-Celeron-M-Processor-450-(1M-Cache-2_00-GHz-533-MHz-FSB),,151,90,1,1.3
39,"2014-11-17 00:01:27.695606","2014-11-17 00:01:27.695606",9,3,,5,,5,2,,333,,900,,1,1,7,http://ark.intel.com/products/27156/Intel-Celeron-M-Processor-ULV-333-(512K-Cache-900-MHz-400-MHz-FSB),,77,83,,1.004

and several thousand additional lines of this sort