Preprocessor Issues

connor

21 May, 2018 11:21 PM

Hey Brett!

I've been toying with my setup of a Preprocessor that outputs a document with pandoc (per your suggestion here), and I've run into a bug, plus an issue that I don't think is necessarily a Marked bug but that I was hoping you'd have some thoughts on how to resolve.

The first has to do with setting the preprocessor and passing it parameters in a per-document way, per here. When I enable the preprocessor simply as a boolean (Custom Preprocessor: true), it works as expected. However, when I try to pass JSON-style parameters to the preprocessor:

Custom Preprocessor: [true, "pdf"]

a few things go wrong. First, that per document setting line appears at the top of the document, even when running the default Marked processor. In addition, if a custom processor is enabled in settings (in addition to the preprocessor), even if it is not enabled for this document, after rendering the document a second time (put the above JSON line in the document, save, then refresh Marked again to force an additional render), the Custom Processor is enabled, and will not disable unless you either disable it globally in the Preferences or change the line in the document. I believe this is a bug on your end, though I'm not entirely sure.

Second, I originally took the script you put together for Custom Export Options and adapted it to have pandoc build a docx file (solving an earlier issue I had). It works swimmingly. But then I tried to adapt it to accept an output extension parameter, so I could render documents to PDF via LaTeX as well. The modification to pass a parameter works fine (other than revealing the bug above), but for whatever reason, the script does not execute pandoc to render a PDF via LaTeX. Running the pandoc command directly works, as does calling the script manually (with the paths hardcoded since the Marked environment variables wouldn't exist). Any ideas why pandoc is happy to build me a PDF when I ask it directly, and happy to build a docx when called via the Preprocessor script, but unwilling to generate a PDF via preprocessor? Note that it does not fail or throw an error (that I can see), it just never generates a file.

I attached a test file that results in the bug above. The preprocessor script is as follows:

#!/bin/bash

function main()
{
    local filename=$(basename "$MARKED_PATH")
    local filebasename="${filename%.*}"
    # filebasename is the base name of the file only, with the extension removed

    local desiredextension="${1:-docx}"
    local outputfile="$MARKED_ORIGIN$filebasename.$desiredextension"

    ~/anaconda3/bin/pandoc -f markdown "$MARKED_PATH" -o "$outputfile"

    echo "NOCUSTOM"
}

cat | main "$@"

It also fails to generate a PDF even if the desired extension is hardcoded to local desiredextension="pdf" in line 9, rather than taking it as an argument.
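For reference, the path handling in the script can be checked on its own, outside Marked. The sketch below is only an illustration: `derive_output_path` is a hypothetical helper, the example paths are made up, and it assumes (as the working docx case suggests) that MARKED_ORIGIN ends with a trailing slash.

```shell
#!/bin/bash

# Hypothetical helper that builds the output path the same way the script does.
# $1 = source file path (stands in for MARKED_PATH)
# $2 = containing directory, assumed to end in "/" (stands in for MARKED_ORIGIN)
# $3 = optional extension; defaults to docx when omitted
derive_output_path() {
    local filename
    filename=$(basename "$1")
    local filebasename="${filename%.*}"    # strip only the last extension
    local desiredextension="${3:-docx}"    # fall back to docx with no argument
    echo "$2$filebasename.$desiredextension"
}

# Made-up example paths, not taken from a real Marked session.
derive_output_path "/Users/connor/notes/report.md" "/Users/connor/notes/" "pdf"
# prints /Users/connor/notes/report.pdf
derive_output_path "/Users/connor/notes/report.md" "/Users/connor/notes/"
# prints /Users/connor/notes/report.docx
```

If both calls print the expected paths, the argument handling is probably not what keeps the PDF from appearing.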

Any ideas?

Thanks!

Connor

  1. Support Staff 1 Posted by Brett on 31 May, 2018 12:38 PM

    Connor, sincerely sorry for the delayed reply. I looked at this and realized there was a lot to figure out here and put it off a bit, then somehow the todo item I added disappeared and it slipped through the cracks.

    Found it in my review, will add it back to the list to dig into this week.

    Thanks,
    Brett

  2. 2 Posted by connor on 31 May, 2018 06:25 PM

    No worries, it happens :) It's not like I threw you a softball, anyway

  3. Support Staff 3 Posted by Brett on 11 Jun, 2018 03:07 PM

    Hey Connor,

    Sorry again for the delay. I recently (v2.5.18) made a change to the custom processor JSON handler that may help with the issue passing "pdf" to the custom processor as an argument. Let me know if you see any difference when using the same settings you have now.

    First, that per document setting line appears at the top of the document, even when running the default Marked processor.

    If you're using the GFM processor, it doesn't recognize metadata and just passes it through. Marked will recognize it if you wrap it in HTML comments and that will hide it from the output.

    <!--
    custom preprocessor: true
    -->
    

    Alternatively, under Advanced prefs you can check "Strip MMD Metadata." Preprocessors are run prior to the stripping, so the Custom Processor: line would still be read, but removed from the final output.

    In addition, if a custom processor is enabled in settings (in addition to the preprocessor), even if it is not enabled for this document, after rendering the document a second time (put the above JSON line in the document, save, then refresh Marked again to force an additional render), the Custom Processor is enabled, and will not disable unless you either disable it globally in the Preferences or change the line in the document. I believe this is a bug on your end, though I'm not entirely sure.

    If custom processor is set to true in the metadata, it will always enable the custom processor, there's no UI override, you need to change the setting in the metadata.

    The modification to pass a parameter works fine (other than revealing the bug above), but for whatever reason, the script does not execute pandoc to render a PDF via LaTeX. Running the pandoc command directly works, as does calling the script manually (with the paths hardcoded since the Marked environment variables wouldn't exist). Any ideas why pandoc is happy to build me a PDF when I ask it directly, and happy to build a docx when called via the Preprocessor script, but unwilling to generate a PDF via preprocessor? Note that it does not fail or throw an error (that I can see), it just never generates a file.

    I have not had a chance to test your script yet, but I would recommend using a full path to pandoc, e.g. /Users/connor/anaconda3/bin/pandoc instead of the tilde shortcut. I also wonder if there's any environment setup loaded in your shell that Marked might be missing when it shells out. I'm not sure what Anaconda does, but if it loads a specific environment, keep in mind that it won't be available to the non-user shell that Marked executes the script in.

    You might also want to suppress any output from the pandoc command with:

    ~/anaconda3/bin/pandoc -f markdown "$MARKED_PATH" -o "$outputfile" 2&> /dev/null
    

    and then ensure a success return with return 0 after the NOCUSTOM in the main function.

    For debugging, you can do something simple like echo out the command itself and make sure that it's executing what you think it is:

    echo "<pre><code>~/anaconda3/bin/pandoc -f markdown \"$MARKED_PATH\" -o \"$outputfile\"</code></pre>"

    (The HTML tags will make sure you see the full output in the preview window.)

    If all that looks good, let me know and I'll dig in and try some local testing. You might also look at using the tee command, though I doubt it would solve the current issues. It would only be useful if you wanted to output the pandoc results to HTML (to Marked) and PDF at the same time.
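Since the script reportedly fails without any visible error, a complementary approach to discarding pandoc's output is to send it to a log file, so that whatever pandoc prints (and its exit status) can be read afterwards. This is a minimal sketch, not something taken from Marked's documentation: `run_logged` is a hypothetical helper and the log location in the usage comment is an arbitrary choice.

```shell
#!/bin/bash

# Hypothetical helper: run a command, append its stdout/stderr and exit status
# to a log file, and keep all of that out of the preview.
# $1 = log file path; remaining arguments = the command to run
run_logged() {
    local logfile="$1"
    shift
    {
        echo "--- $(date): $*"
        echo "PATH=$PATH"          # records what the calling shell can actually see
    } >> "$logfile"
    "$@" >> "$logfile" 2>&1        # capture both streams instead of discarding them
    local rc=$?
    echo "exit status: $rc" >> "$logfile"
    return "$rc"
}

# Inside main(), the pandoc line could then look something like this
# (the log path is an assumption, pick any writable location):
#   run_logged "$HOME/marked-preprocessor.log" /usr/local/bin/pandoc -f markdown "$MARKED_PATH" -o "$outputfile"
```

Because the helper only writes to the log, the script's standard output still carries nothing but the NOCUSTOM line.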

  4. 4 Posted by connor on 11 Jun, 2018 09:38 PM

    Okay, I updated to the latest Version 2.5.18 (952). I got some interesting results.

    1. Without enclosing Custom Preprocessor: [true, "pdf"] in an HTML comment, both the MMD and GFM processors include that text at the top of the rendering. When I then wrap it in a comment, the GFM processor does not display it, but the MMD processor still does. Checking the "Strip MMD3 Metadata headers" has no effect for the GFM processor, but for the MMD processor, when the metadata line is wrapped in an HTML comment, it reformats but still displays the text ("Custom Preprocessor" becomes "custompreprocessor"). Not sure what to make of all this.

    2. Re: the UI override, that isn't quite the issue. The issue is I'm enabling the PREprocessor via metadata, which is also force enabling the custom PROCESSOR. What you described is the expected behavior, but I'm getting BOTH the preprocessor and custom processor enabled when the metadata have only enabled the preprocessor.

    3. Re: the script, I reviewed the changes you suggested, to no avail. I removed the tilde shortcut from the pandoc path. I also switched to the plain-jane installation of pandoc on my machine, /usr/local/bin/pandoc, instead of the one brought in by anaconda. Not sure if that will handle any environment variables or other issues introduced by anaconda, but I gave it a shot. I also suppressed output with &> /dev/null (I presumed 2&> was a typo), and added a return 0 at the end of the function. The function is calling pandoc correctly as far as I can tell from the echo. None of the above solved the problem when trying to generate a PDF, though creation of a .docx file continues to work (except when I tried 2&> before realizing it should be &>).

    Is it possible that Marked's non-user shell environment does not have access to LaTeX that is called by pandoc, or something like that?
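One way to test that hypothesis is to check, from inside the preprocessor itself, whether a LaTeX engine is visible on PATH. The sketch below is untested against Marked: `find_latex_engine` is a hypothetical helper, and /Library/TeX/texbin is an assumed install location (the usual MacTeX one) that should be replaced with whatever `which pdflatex` reports in Terminal.

```shell
#!/bin/bash

# Hypothetical helper: print the path of the first LaTeX engine found on PATH,
# or return 1 if this shell cannot see any of them.
find_latex_engine() {
    local engine
    for engine in pdflatex xelatex lualatex; do
        if command -v "$engine" > /dev/null 2>&1; then
            command -v "$engine"
            return 0
        fi
    done
    return 1
}

# Assumption: the TeX binaries live in /Library/TeX/texbin.
# If no engine is visible, extend PATH before pandoc is called.
if ! find_latex_engine > /dev/null; then
    export PATH="$PATH:/Library/TeX/texbin"
fi
```

If the engine only shows up after PATH is extended, that would explain why docx output works while PDF output does not. Pandoc's `--pdf-engine` option, given a full path to the engine, may be an alternative to changing PATH.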

    Thanks for all the help!

    Connor

  5. Support Staff 5 Posted by Brett on 12 Jun, 2018 09:03 PM

    Ok, there's a lot going wrong here. I don't have time in the immediate
    future to dig into this, since a couple of other issues are higher
    priority, but I'm making notes to tackle it as soon as I can free up the
    space to do so. Is any of this an immediate high priority/showstopper in
    your workflow?

    -Brett

  6. 6 Posted by connor on 12 Jun, 2018 09:05 PM

    None are showstoppers; all of them have easy workarounds. Prioritize as you see fit :)

    Thanks!

    Connor

Already uploaded files

  • example.txt 82 Bytes
