Since your question is about storing, maintaining and referencing a large set of exercises (potentially on the order of 10,000), I'm going to concentrate on that and keep the styling here very basic.
It's possible to define conditionals using \newif (or through commands provided by packages such as etoolbox). For example:
\newif\ifsolutions
\newif\ifcomplete
These default to false, but can be switched on:
\solutionstrue
\completetrue
It's also useful to provide syntactic commands to mark the solution. For example:
\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}
As has been mentioned in one of the other answers, it's also possible to use environments and the comment package. For multilingual support, the caption hooks can be used to redefine \solutionname as appropriate. For example:
\usepackage[USenglish]{babel}
\addto\captionsUSenglish{%
\renewcommand\solutionname{Solution}%
}
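As a minimal sketch of how a second language could be added with the same hook mechanism (the portuguese babel option and the translation of \solutionname are assumptions chosen for illustration, not part of the setup above):

```latex
\documentclass{article}
% The last option listed (USenglish) becomes the main document language.
\usepackage[portuguese,USenglish]{babel}

\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}

% Each caption hook is executed whenever babel switches to that language.
\addto\captionsUSenglish{\renewcommand\solutionname{Solution}}
\addto\captionsportuguese{\renewcommand\solutionname{Solu\c{c}\~ao}}

\begin{document}
\solution $y' = 2\cos(2x)$

\selectlanguage{portuguese}
\solution $y' = 2\cos(2x)$
\end{document}
```

The heading printed by \solution then follows the current language without any change to the exercise text itself.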
Now an exercise can be written using these commands. For example:
$y = \sin(2x)$
\ifsolutions
\solution
\ifcomplete
Intermediate steps, further details etc.
\fi
$y' = 2\cos(2x)$
\fi
Environments provide a more LaTeXy feel, but let's concentrate on storing and accessing the questions.
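For reference, the environment approach could look something like the following sketch, which uses the comment package. The environment name answer is hypothetical and chosen only for this example:

```latex
\documentclass{article}
\usepackage{comment}

% 'answer' is a hypothetical environment name used only in this sketch.
\includecomment{answer}   % typeset the contents of every answer environment
%\excludecomment{answer}  % use this line instead to hide all answers

\begin{document}
\begin{enumerate}
\item $y = \sin(2x)$
% \begin{answer} and \end{answer} must each be on a line of their own.
\begin{answer}
\par\textbf{Solution}:\par
$y' = 2\cos(2x)$
\end{answer}
\end{enumerate}
\end{document}
```

Switching between \includecomment and \excludecomment plays the same role as \solutionstrue and \solutionsfalse in the conditional version.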
The simple method, which has already been suggested, is to put each question in a separate file and load it with \input. For example, if this exercise is in the file exercises/calculus/easy/dsin.tex then the following MWE works:
\documentclass{article}
\newif\ifsolutions
\newif\ifcomplete
\solutionstrue
\completetrue
\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}
\begin{document}
\begin{enumerate}
\item \input{exercises/calculus/easy/dsin}
\end{enumerate}
\end{document}
This is a relatively generic method, which can easily be translated to other TeX formats. For example, the Plain TeX equivalent is:
\newif\ifsolutions
\newif\ifcomplete
\solutionstrue
\completetrue
\def\solutionname{Solution}
\long\def\solution{\par{\bf\solutionname}:\par}
\newcount\questionnum
\long\def\question{%
\par
\advance\questionnum by 1\relax
\number\questionnum.
}
\question \input exercises/calculus/easy/dsin
\bye
The problem is that, although this structure is fine for a small number of questions, it can become unmanageable for 10,000. In the comments I mentioned datatooltk, which can read and write .dbtex files (datatool's internal format), but I don't recommend using this format directly to store the exercises. These files just contain LaTeX code that defines the internal registers and control sequences used by datatool to store the required data. There's no compression, and loading a large file takes up a huge amount of TeX's resources. The datatooltk application works better as an intermediary that can pull filtered, shuffled or sorted data from external sources in a way that can easily be input in the document. (See the datatool performance page that compares build times for large databases.)
There are switches, such as --shuffle or --sort, which instruct datatooltk to shuffle or sort the data after it's been pulled from the data source. This uses Java, which is more efficient than TeX, but if the data is stored in an SQL database, it's even more efficient to include these steps in the SQL statement passed to the --sql switch. (Currently, datatooltk is only configured for MySQL, but it may be possible to use something else if the necessary .jar file can be added to the class path.)
SQL databases can be optimized to improve performance. Suppose you want to randomly select 20 questions from 500. How do you perform that selection in LaTeX? First you'd need to use the shell to find all the available files (or have an index file that can be parsed). Then you'd need to shuffle the list, which will take a while to do with TeX. It's more efficient to do this with SQL. (See, for example, "MySQL select 10 random rows from 600K rows fast".)
If you decide to use SQL, the next thing to consider is the table structure.
- You'll need a unique id field. With this you'll be able to specifically select certain questions rather than have a random selection. (An auto increment primary key is best.)
- A field containing the question. (Let's call it Question.)
- A field containing the brief answer. (Let's call it Answer.)
- A field containing the extended answer. (Let's call it ExtendedAnswer.)
- A field identifying the difficulty level. (Let's call it Level.) This could be an integer (1 = easy) or an enumeration (easy, medium, hard).
- A field identifying the topic. (Let's call it Topic.) An enumeration is probably the simplest type (for example, calculus, settheory).
I'm not quite sure about the language. There are two approaches that I can think of: have extra fields for the other language (for example, QuestionPortuguese, AnswerPortuguese and ExtendedAnswerPortuguese), or have a separate entry for the question in each language, with an extra field identifying the language.
So the above exercise example could have:
Question => $y = \sin(2x)$
Answer => $y' = 2\cos(2x)$
ExtendedAnswer => Intermediate steps, further details etc. \[y' = 2\cos(2x)\]
Level => 1
Topic => calculus
together with either
Language => english
or
ExtendedAnswerPortuguese => Passos intermédios, etc. \[y' = 2\cos(2x)\]
Note that this doesn't include the syntactic command \solution or the conditionals \ifsolutions and \ifcomplete, which makes it easier to arrange the various parts of the question and answer.
Some exercises may require a particular package (such as amsmath or graphicx), so there could also be a field listing the required packages. For example, Packages => graphicx,amsmath.
Any images or verbatim text must be stored outside the database somewhere on the file system. They could be on TeX's path or the database table could have a field with a list of external resources or the question/answer could simply use the full path.
The datatooltk call can be made before the LaTeX run or from within the document using the shell escape. There's also a datatooltk rule for arara users. Suppose I use datatooltk to pull a random selection of questions and save the results in a file called exercises.dbtex. This can then be loaded in the document using:
\DTLloaddbtex{\exercisedb}{exercises.dbtex}
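If the shell escape route is used, the call could be placed just before the database is loaded. The following is only a sketch: the table name exercises, the database name exercisedata and the user name exerciseuser are hypothetical, password handling is left out entirely, and the switches should be checked against the datatooltk manual for the installed version.

```latex
% Sketch only: needs shell escape enabled (pdflatex --shell-escape).
% Hypothetical names: table "exercises", database "exercisedata",
% user "exerciseuser". No password is supplied here.
\immediate\write18{datatooltk --output exercises.dbtex
  --sqldb exercisedata --sqluser exerciseuser
  --sql "SELECT * FROM exercises WHERE Topic='calculus'
         ORDER BY RAND() LIMIT 20"}
% Load the file that datatooltk has just written.
\DTLloaddbtex{\exercisedb}{exercises.dbtex}
```

Here the filtering, shuffling and limiting are all done by the SQL statement, so TeX only ever sees the 20 selected rows. Running the same command before the LaTeX run avoids the need for shell escape.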
If the data includes the Packages field, you can make sure all the required packages are loaded by adding the following to the preamble:
\DTLforeach*{\exercisedb}{\Packages=Packages}
{\DTLifnullorempty{\Packages}{}{\usepackage{\Packages}}}
In the main part of the document:
\begin{enumerate}
\DTLforeach*{\exercisedb}% data base
{\Question=Question,\Answer=Answer,\ExtendedAnswer=ExtendedAnswer}% assignment list
{%
\item \Question
\ifsolutions
\solution
\ifcomplete
\ExtendedAnswer
\else
\Answer
\fi
\fi
}
\end{enumerate}
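To try the loop without datatooltk or an SQL server, the database can be built by hand in the preamble. This is a minimal sketch in which a one-row database named exercises stands in for the file loaded with \DTLloaddbtex. The wording of the question and the use of \(...\) in place of $...$ are choices made for this example only:

```latex
\documentclass{article}
\usepackage{datatool}

\newif\ifsolutions
\newif\ifcomplete
\solutionstrue
\completetrue % comment out to show the brief answer instead

\newcommand{\solutionname}{Solution}
\newcommand{\solution}{\par\textbf{\solutionname}:\par}

% Stand-in for \DTLloaddbtex: a hand-built database with a single row.
\DTLnewdb{exercises}
\DTLnewrow{exercises}
\DTLnewdbentry{exercises}{Question}{Differentiate \(y = \sin(2x)\).}
% \(...\) is used so that no entry starts with $, which datatool
% also recognises as a currency symbol when it guesses data types.
\DTLnewdbentry{exercises}{Answer}{\(y' = 2\cos(2x)\)}
\DTLnewdbentry{exercises}{ExtendedAnswer}%
  {Intermediate steps, further details etc. \[y' = 2\cos(2x)\]}

\begin{document}
\begin{enumerate}
\DTLforeach*{exercises}% database name
{\Question=Question,\Answer=Answer,\ExtendedAnswer=ExtendedAnswer}%
{%
  \item \Question
  \ifsolutions
    \solution
    \ifcomplete
      \ExtendedAnswer
    \else
      \Answer
    \fi
  \fi
}
\end{enumerate}
\end{document}
```

Once the real data is available, the \DTLnewdb block can be replaced with the \DTLloaddbtex line and the loop left unchanged apart from the database name.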
Further reading: Using the datatool Package for Exams or Assignment Sheets
datatooltk in batch mode). – Nicola Talbot Sep 29 '16 at 08:14
problems, problems/algebra, problems/algebra/beginners, problems/algebra/cuttingedge, problems/analysis, problems/analysis/complex/intermediate etc. There really isn't a need for a database unless you want to combine questions in different ways such that there are no natural joints. But here the divisions are just that natural ones. [If you cover several topics in a course, you might prefer to have level be the higher directory level and topic the lower one. Otherwise, it probably makes no odds.] – cfr Sep 30 '16 at 01:21
@EASE package? The short description: @EASE stands for AcroTeX Exam Assembly System Environment. @EASE allows educators to assemble a database of questions. With the @EASE control panel, the educator opens appropriate database files (PDF-files), selects questions of interest to builds an exam, which is a LaTeX source file, consisting of the questions selected. @EASE requires Acrobat Pro 7.0 or later to execute some JavaScript not available in Adobe Reader. Disclaimer: I've not used it. – Ross Sep 30 '16 at 08:25
\newif\ifcomplete and then just have \ifcomplete extra stuff\fi within the answer. Alternatively, with the database approach have an extra field for the complete answer (depends how different the complete answer is from the abridged answer). – Nicola Talbot Sep 30 '16 at 10:38