
I'm generating a PDF from a Markdown file that begins this way:

---
header-includes:
- \usepackage{fvextra}
- \DefineVerbatimEnvironment{Highlighting}{Verbatim}{breaklines, breakanywhere, commandchars=\\\{\}}

title: "Geomatique"
subtitle: "web mapping"
author: Marc Le Bihan
geometry: margin=2cm
fontsize: 11pt
classoption: fleqn
urlcolor: blue
---

# Installation de l'environnement

wget https://download.geofabrik.de/europe/france/midi-pyrenees-latest-free.shp.zip
wget https://www.data.gouv.fr/fr/datasets/contours-des-departements-francais-issus-d-openstreetmap/
wget https://www.data.gouv.fr/fr/datasets/decoupage-administratif-communal-francais-issu-d-openstreetmap/

When I generate it with this pandoc command:

pandoc geomatique.md -o geomatique.pdf

I get lines that are not fully broken, even though breakanywhere is set.

enter image description here

You can see the arrow → and the line break at the space just after wget, which shows that Pandoc has taken the Verbatim directive into account, but the second part of the command isn't broken correctly.

And if I copy and paste text from the generated PDF, it comes out incomplete, with part of it lost:

wget https://www.data.gouv.fr/fr/datasets/decoupage-administratif-communal-francais-issu-d-openstr

instead of

wget https://www.data.gouv.fr/fr/datasets/decoupage-administratif-communal-francais-issu-d-openstreetmap/

@Marijn: here's what it generates in a .tex file:

% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
\PassOptionsToPackage{dvipsnames,svgnames*,x11names*}{xcolor}
%
\documentclass[
  11pt,
  fleqn]{article}
\usepackage{amsmath,amssymb}
\usepackage{lmodern}
\usepackage{iftex}
\ifPDFTeX
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
  \usepackage{unicode-math}
  \defaultfontfeatures{Scale=MatchLowercase}
  \defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
  \usepackage[]{microtype}
  \UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
  \IfFileExists{parskip.sty}{%
    \usepackage{parskip}
  }{% else
    \setlength{\parindent}{0pt}
    \setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
  \KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\IfFileExists{bookmark.sty}{\usepackage{bookmark}}{\usepackage{hyperref}}
\hypersetup{
  pdftitle={Geomatique},
  pdfauthor={Marc Le Bihan},
  colorlinks=true,
  linkcolor={Maroon},
  filecolor={Maroon},
  citecolor={Blue},
  urlcolor={blue},
  pdfcreator={LaTeX via pandoc}}
\urlstyle{same} % disable monospaced font for URLs
\usepackage[margin=2cm]{geometry}
\usepackage{color}
\usepackage{fancyvrb}
\newcommand{\VerbBar}{|}
\newcommand{\VERB}{\Verb[commandchars=\\\{\}]}
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{commandchars=\\\{\}}
% Add ',fontsize=\small' for more characters per line
\newenvironment{Shaded}{}{}
\newcommand{\AlertTok}[1]{\textcolor[rgb]{1.00,0.00,0.00}{\textbf{#1}}}
\newcommand{\AnnotationTok}[1]{\textcolor[rgb]{0.38,0.63,0.69}{\textbf{\textit{#1}}}}
\newcommand{\AttributeTok}[1]{\textcolor[rgb]{0.49,0.56,0.16}{#1}}
\newcommand{\BaseNTok}[1]{\textcolor[rgb]{0.25,0.63,0.44}{#1}}
\newcommand{\BuiltInTok}[1]{#1}
\newcommand{\CharTok}[1]{\textcolor[rgb]{0.25,0.44,0.63}{#1}}
\newcommand{\CommentTok}[1]{\textcolor[rgb]{0.38,0.63,0.69}{\textit{#1}}}
\newcommand{\CommentVarTok}[1]{\textcolor[rgb]{0.38,0.63,0.69}{\textbf{\textit{#1}}}}
\newcommand{\ConstantTok}[1]{\textcolor[rgb]{0.53,0.00,0.00}{#1}}
\newcommand{\ControlFlowTok}[1]{\textcolor[rgb]{0.00,0.44,0.13}{\textbf{#1}}}
\newcommand{\DataTypeTok}[1]{\textcolor[rgb]{0.56,0.13,0.00}{#1}}
\newcommand{\DecValTok}[1]{\textcolor[rgb]{0.25,0.63,0.44}{#1}}
\newcommand{\DocumentationTok}[1]{\textcolor[rgb]{0.73,0.13,0.13}{\textit{#1}}}
\newcommand{\ErrorTok}[1]{\textcolor[rgb]{1.00,0.00,0.00}{\textbf{#1}}}
\newcommand{\ExtensionTok}[1]{#1}
\newcommand{\FloatTok}[1]{\textcolor[rgb]{0.25,0.63,0.44}{#1}}
\newcommand{\FunctionTok}[1]{\textcolor[rgb]{0.02,0.16,0.49}{#1}}
\newcommand{\ImportTok}[1]{#1}
\newcommand{\InformationTok}[1]{\textcolor[rgb]{0.38,0.63,0.69}{\textbf{\textit{#1}}}}
\newcommand{\KeywordTok}[1]{\textcolor[rgb]{0.00,0.44,0.13}{\textbf{#1}}}
\newcommand{\NormalTok}[1]{#1}
\newcommand{\OperatorTok}[1]{\textcolor[rgb]{0.40,0.40,0.40}{#1}}
\newcommand{\OtherTok}[1]{\textcolor[rgb]{0.00,0.44,0.13}{#1}}
\newcommand{\PreprocessorTok}[1]{\textcolor[rgb]{0.74,0.48,0.00}{#1}}
\newcommand{\RegionMarkerTok}[1]{#1}
\newcommand{\SpecialCharTok}[1]{\textcolor[rgb]{0.25,0.44,0.63}{#1}}
\newcommand{\SpecialStringTok}[1]{\textcolor[rgb]{0.73,0.40,0.53}{#1}}
\newcommand{\StringTok}[1]{\textcolor[rgb]{0.25,0.44,0.63}{#1}}
\newcommand{\VariableTok}[1]{\textcolor[rgb]{0.10,0.09,0.49}{#1}}
\newcommand{\VerbatimStringTok}[1]{\textcolor[rgb]{0.25,0.44,0.63}{#1}}
\newcommand{\WarningTok}[1]{\textcolor[rgb]{0.38,0.63,0.69}{\textbf{\textit{#1}}}}
\usepackage{graphicx}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
% Set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{-\maxdimen} % remove section numbering
\usepackage{fvextra}
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{breaklines, breakanywhere, commandchars=\\\{\}}
\ifLuaTeX
  \usepackage{selnolig}  % disable illegal ligatures
\fi

\title{Geomatique}
\usepackage{etoolbox}
\makeatletter
\providecommand{\subtitle}[1]{% add subtitle to \maketitle
  \apptocmd{\@title}{\par {\large #1 \par}}{}{}
}
\makeatother
\subtitle{web mapping}
\author{Marc Le Bihan}
\date{}

\begin{document}
\maketitle

\hypertarget{installation-de-lenvironnement}{%
\section{Installation de l'environnement}\label{installation-de-lenvironnement}}

\begin{Shaded}
\begin{Highlighting}[]
\FunctionTok{wget}\NormalTok{ https://download.geofabrik.de/europe/france/midi{-}pyrenees{-}latest{-}free.shp.zip}
\FunctionTok{wget}\NormalTok{ https://www.data.gouv.fr/fr/datasets/contours{-}des{-}departements{-}francais{-}issus{-}d{-}openstreetmap/}
\FunctionTok{wget}\NormalTok{ https://www.data.gouv.fr/fr/datasets/decoupage{-}administratif{-}communal{-}francais{-}issu{-}d{-}openstreetmap/}
\end{Highlighting}
\end{Shaded}

\hypertarget{satellite-sentinel-2a-ou-b}{%
\subsection{Satellite Sentinel 2A ou B}\label{satellite-sentinel-2a-ou-b}}

\begin{itemize}
\tightlist
\item 13 Bandes spectrales du visible à l'infrarouge court (SWIR), en passant par le proche infrarouge (PIR).
\item Résolution spatiale : 10 à 60 mètres
\item Emprise de 290 kilomètres de large
\item Projection UTM 31 / EPSG 32631
\item Fréquence de balayage : tous les cinq jours
\end{itemize}

Accessible sur \href{https://scihub.copernicus.eu/dhus/}{Copernicus EU}

Après s'être créé un compte gratuit, télécharger ces données en entrant dans la barre de recherche : (Données \emph{Sentinel 2}, Toulouse) :

\begin{itemize}
\tightlist
\item \texttt{S2A\_MSIL1C\_20170705T105031\_N0205\_R051\_T31TCJ\_20170705T105605.SAFE}
\item \texttt{S2A\_MSIL1C\_20170215T105121\_N0204\_R051\_T31TCJ\_20170215T105607.SAFE}
\item \texttt{S2A\_MSIL1C\_20170824T105031\_N0205\_R051\_T31TCJ\_20170824T105240.SAFE} : ce dernier est offline.
\end{itemize}

Note : Copernicus permet de faire des requêtes par type de satellite, zone visée, date, couverture nuageuse, qualité du post-traitement.

\hypertarget{compruxe9hension-dune-image-satellite}{%
\section{Compréhension d'une image satellite}\label{compruxe9hension-dune-image-satellite}}

D'après \href{https://cms.geobretagne.fr/content/comprendre-une-image-satellitaire}{Comprendre une image satellitaire}

\hypertarget{bandes-spectrales-ou-canaux}{%
\subsection{Bandes spectrales (ou canaux)}\label{bandes-spectrales-ou-canaux}}

\begin{itemize}
\item \textbf{Bandes} : rayons gamma, rayons X, ultra-violet, bleu, vert, rouge, Proche infra-rouge (PIR), infra-rouge, micro-ondes, ondes radio.
\item \textbf{Fréquence} : de haute à basse
\item \textbf{Longueur d'onde} : faible à grande
\end{itemize}

L'on en produit des images de type raster. où \textbf{les valeurs des bandes sont représentées en niveaux de gris}. et traduisent traduisent la \textbf{réflectance} ou \textbf{réflexivité}.

La \textbf{résolution spectrale} dit sur un rayon de combien de mètres porte la valeur d'un pixel : en contrepartie d'une résolution fine, les fenêtres des capteurs seront étroits.

Si l'on réassocie les bandes bleues, vertes et rouges ensemble, par simple synthèse additive des trois couleurs, on réobtient \textbf{l'image en vraies couleurs}.

Mais, pour voir aussi l'infra-rouge ou d'autres bandes non visibles à l'oeil nu, l'on recoure fréquemment à une \textbf{image en fausses couleurs} dépendant de ce que l'on veut mettre en évidence (exemple : mieux discriminer les surfaces végétales ou minérales), où en contrepartie, l'on ne fait pas apparaître la bande du bleu.

\hypertarget{ruxe9flectance-caractuxe9ristiques}{%
\subsection{Réflectance caractéristiques}\label{ruxe9flectance-caractuxe9ristiques}}

\hypertarget{vuxe9guxe9tation}{%
\subsubsection{Végétation}\label{vuxe9guxe9tation}}

\begin{itemize}
\item Infrarouge haut
\item Rouge et vert bas
\item Sa réflectance caractéristique montre des sauts dans le domaine visible (0.4 à 0.7 µm) et le proche infrarouge (1 µm)
\item L'indice \textbf{\textcolor{blue}{Normalized Difference Vegetation Index} (NVDI)} met cette végétation en évidence.
  \[ NVDI = \frac{PIR - R}{PIR + R} \]
\end{itemize}

\hypertarget{eau}{%
\subsubsection{Eau}\label{eau}}

\begin{itemize}
\tightlist
\item Basse pour toutes les bandes : elle dépend des matières en suspension qu'elle transporte.
\end{itemize}

\begin{figure}
\centering
\includegraphics{./geomatique_webmapping/signatures_spectrales_combinees.png}
\caption{Signatures spectrales combinées}
\end{figure}

\end{document}

  • The Highlighting environment would not color syntax. – egreg Jun 27 '21 at 08:45
  • Can you show the LaTeX code that is generated by Pandoc (pandoc geomatique.md -o geomatique.tex)? – Marijn Jun 27 '21 at 08:54
  • @Marijn I edited my post to add the information. – Marc Le Bihan Jun 27 '21 at 10:04
  • @egreg I copy-pasted the whole directive from another StackExchange post without truly understanding the meaning of Highlighting. If I try to remove it, this way - \DefineVerbatimEnvironment{Verbatim}{breaklines, breakanywhere, commandchars=\\\{\}} my Pandoc command fails later with ! LaTeX Error: File `selnolig.sty' not found. which means I have not mastered how to write that DefineVerbatimEnvironment directive. – Marc Le Bihan Jun 27 '21 at 10:08
  • @MarcLeBihan thanks for the LaTeX code, however this is not the full file. It is important to add all the code to the question, because the definitions of the commands used in the snippet, which are located in other parts of the file, are needed to understand and modify the final output. – Marijn Jun 27 '21 at 11:25
  • @Marijn Sorry. Here it is. – Marc Le Bihan Jun 27 '21 at 13:40
  • @MarcLeBihan the new code is longer, but it still doesn't seem complete. For example in your output it shows the title (Geomatique), the subtitle (web mapping) and your name. This information is not present in the code that you posted. Also, more importantly, the definitions of the environments are still missing. – Marijn Jun 27 '21 at 14:28
  • @Marijn but that really is the complete output of the command pandoc geomatique.md -o geomatique.tex that I ran. Maybe additional parameters should be added to the pandoc command to get better (or different) output? – Marc Le Bihan Jun 27 '21 at 17:42
  • @MarcLeBihan you are right, my apologies. Pandoc outputs only partial LaTeX using the default settings. The full code can be obtained with pandoc -s geomatique.md -o geomatique.tex, so adding -s as a command line option. Could you add that to the question? Thanks :) – Marijn Jun 27 '21 at 18:03
  • @Marijn ok. I have refreshed the tex code. – Marc Le Bihan Jun 27 '21 at 18:56

2 Answers

The lack of line breaks is caused by the way Pandoc implements syntax coloring. It wraps each token of the output in a \XXXTok command, with a long list of token types that are printed in specific colors. Each URL in the question is a single token that gets wrapped in a \NormalTok command:

\FunctionTok{wget}\NormalTok{ https://download.geofabrik.de/europe/france/midi-pyrenees-latest-free.shp.zip}

The \NormalTok command is defined as:

\newcommand{\NormalTok}[1]{#1}

So it just returns its argument without coloring. However, the grouping around the argument prevents fvextra from inserting line breaks within a token. This causes no problems when tokens are short, which is typically the case for programming code, but it fails with long tokens such as URLs.
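The effect of the grouping can be checked outside Pandoc with a small standalone file. This is only a sketch, not Pandoc's actual output: it copies the two token macros and the Highlighting definition from the question, and typesets the same URL once as plain verbatim text and once wrapped in \NormalTok. The expectation, based on the explanation above, is that only the first block is broken anywhere along the URL.

```latex
\documentclass{article}
\usepackage{xcolor}
\usepackage{fvextra}

% Token macros copied from the Pandoc-generated preamble in the question
\newcommand{\FunctionTok}[1]{\textcolor[rgb]{0.02,0.16,0.49}{#1}}
\newcommand{\NormalTok}[1]{#1}

% Same definition as in the question's header-includes
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{breaklines, breakanywhere, commandchars=\\\{\}}

\begin{document}

% Case 1: the URL is not inside any group, so breakanywhere can act on it
\begin{Highlighting}[]
wget https://www.data.gouv.fr/fr/datasets/decoupage-administratif-communal-francais-issu-d-openstreetmap/
\end{Highlighting}

% Case 2: the URL is the argument of \NormalTok, as Pandoc emits it
\begin{Highlighting}[]
\FunctionTok{wget}\NormalTok{ https://www.data.gouv.fr/fr/datasets/decoupage-administratif-communal-francais-issu-d-openstreetmap/}
\end{Highlighting}

\end{document}
```

Compiling this with pdflatex and comparing the two blocks isolates the problem to the braces around the token rather than to the URL itself.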

Ideally, the solution would be to switch off the generation of \NormalTok in Pandoc, given that it does nothing and causes these issues, but that does not seem to be possible.

Alternatively, you can redefine \NormalTok as something that does break automatically. One possibility is \texttt combined with the hyphenat package (loaded with the htt option), as described in How to get long \texttt sections to break. Note that this only works if the long lines contain hyphens, which is the case in the example.

You can also switch off the arrow character and the indentation by setting the options breaksymbol= and breakanywheresymbolpre= to empty values.

MWE:

---
header-includes:
- \usepackage{fvextra}
- \usepackage[htt]{hyphenat}
- \let\NormalTok\texttt
- \DefineVerbatimEnvironment{Highlighting}{Verbatim}{breaklines, breakanywhere, breaksymbol=, breakanywheresymbolpre=, commandchars=\\\{\}}

title: "Geomatique"
subtitle: "web mapping"
author: Marc Le Bihan
geometry: margin=2cm
fontsize: 11pt
classoption: fleqn
urlcolor: blue
---

# Installation de l'environnement

```bash
wget https://download.geofabrik.de/europe/france/midi-pyrenees-latest-free.shp.zip
wget https://www.data.gouv.fr/fr/datasets/contours-des-departements-francais-issus-d-openstreetmap/
wget https://www.data.gouv.fr/fr/datasets/decoupage-administratif-communal-francais-issu-d-openstreetmap/
```

Result:

enter image description here

Note that when you copy the code from the PDF, the line break is copied too, so you cannot paste it directly into a terminal: you first need to remove the line break, for example by pasting it into a text editor. I think this is unavoidable with PDF output.

Marijn

fvextra v1.5 (2022/11/30) adds a new option breaknonspaceingroup that inserts line breaks within groups. This can be used to enable breaks within Pandoc's token macros.

---
header-includes:
- |
  ```{=latex}
  \usepackage{fvextra}
  \DefineVerbatimEnvironment{Highlighting}{Verbatim}{
    commandchars=\\\{\},
    breaklines, breaknonspaceingroup, breakanywhere}
  ```
---

Example line breaks
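The same settings can be tried in a standalone LaTeX file, without going through Pandoc. This is a sketch under some assumptions: the two token macros are copied from the generated preamble in the question, and the optional date after the package name is the standard LaTeX way to request a minimum package version, used here so that the log contains a warning if the installed fvextra predates the release quoted above.

```latex
\documentclass{article}
\usepackage{xcolor}

% The optional date asks LaTeX to warn in the log when the installed
% fvextra is older than 2022/11/30 (the v1.5 release date).
\usepackage{fvextra}[2022/11/30]

% Token macros copied from the Pandoc-generated preamble in the question
\newcommand{\FunctionTok}[1]{\textcolor[rgb]{0.02,0.16,0.49}{#1}}
\newcommand{\NormalTok}[1]{#1}

% breaknonspaceingroup allows breaks inside the braces of \NormalTok{...}
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{
  commandchars=\\\{\},
  breaklines, breaknonspaceingroup, breakanywhere}

\begin{document}

\begin{Highlighting}[]
\FunctionTok{wget}\NormalTok{ https://www.data.gouv.fr/fr/datasets/decoupage-administratif-communal-francais-issu-d-openstreetmap/}
\end{Highlighting}

\end{document}
```

With an older fvextra the breaknonspaceingroup key is unknown, so the file is expected to stop with an error instead of silently producing unbroken lines.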

G. Poore
  • Very interesting new feature! But I wonder: how can I check that my LaTeX packages, such as fvextra, are up to date on my Linux computer, and if they aren't, trigger an upgrade? – Marc Le Bihan Dec 01 '22 at 16:24
  • @MarcLeBihan If you use the latest TeX Live, you can just have the package manager update everything. – G. Poore Dec 01 '22 at 22:55