Slide Title Extractor Part 1 of 3
This is a three-part series about how I went about solving a very small computing problem.
I wanted to extract the text inside the \frametitle{} command in Beamer (a LaTeX package for making slides).
Example:
\frame{
\frametitle{Notes}
\begin{itemize}
\item Note 1
\end{itemize}
}
From this I would pull out the title "Notes".
My first attempt was a bash script using sed, tr, and more sed.
#!/bin/bash
#remove comments and get mostly frametitles
sed -n -f ste.sed "$1" > "$1.1"
#remove line endings
tr "\n" " " < "$1.1" > "$1.2"
#make new line endings
tr "}" "\n" < "$1.2" > "$1.3"
#keep only the lines that still contain a \frametitle
sed -n '/\\frametitle{/p' "$1.3" > "$1.4"
#fix spacing
sed 's/[ ]* / /g' "$1.4" > "$1.5"
#remove the \frametitles (with or without a leading space) and put in periods
sed 's/ *\\frametitle{//;s/$/./' "$1.5" > "$1.6"
#remove line endings, leaving a space between titles
tr "\n" " " < "$1.6" > "$1.7"
cp "$1.7" "${1}_summary.txt"
#remove only the numbered temporary files
rm "$1".[1-7]
where ste.sed was:
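#delete LaTeX comment lines
/^%/d
#on lines containing \frametitle, append the next input line
/\\frametitle/{
N
#if the appended line begins with }, replace everything through that } with a space
s/.*\n}/ /
#print whatever still contains a \frametitle followed by a }
/\\frametitle.*}/p
}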
I thought this was inelegant, so I decided to rewrite it; the rewrites are parts 2 and 3 of this series.
If anyone knows a regex that gets this to work without all of my gymnastics, I would love to see it.
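For the simplest case, a single sed expression might be a starting point. This is only a sketch under some assumptions: each \frametitle{...} sits on one line, the title contains no nested braces, and comment lines start with % in the first column. The file name slides.tex and the sample input are made up for illustration.

```shell
# Hypothetical sample input; slides.tex is an assumed file name.
cat > slides.tex <<'EOF'
% \frametitle{Commented out}
\frame{
\frametitle{Notes}
\begin{itemize}
\item Note 1
\end{itemize}
}
\frame{
  \frametitle{More Notes}
}
EOF

# Drop comment lines, then print only the text inside \frametitle{...}.
# Assumes one title per line and no nested braces inside the title.
titles=$(sed -n '/^%/d; s/.*\\frametitle{\([^}]*\)}.*/\1/p' slides.tex)
printf '%s\n' "$titles"
```

This prints one title per line rather than the period-separated summary my script produces, and it will miss titles that span several lines or contain commands like \emph{...}.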