Brett on 25 Apr, 2017 07:26 PM
First off, are your scripts taking STDIN input? i.e., in a bash script,
you would want to `cat|tee` to ensure the STDIN continues. Assuming
that's all true, if you run `cat testfile.md|yourscript.py` on the
command line does it finish immediately? And is your script writing out
incrementally or all at once at the end?
1. It needs to pick up the STDIN
2. Nothing should be written out (STDOUT) until the script is complete
and ready to exit with a success code of zero
3. Writing to STDERR should be avoided unless there's actually an error
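The three rules above can be sketched as a minimal Python skeleton. This is a sketch only; `process` and `main` are assumed names standing in for your real logic, not anything Marked requires:

```python
def process(text):
    # Hypothetical transform standing in for your real pre-processing.
    return text.upper()


def main(infile, outfile):
    source = infile.read()      # 1. pick up all of STDIN first
    result = process(source)    # do the work before writing anything
    outfile.write(result)       # 2. one write to STDOUT at the end
    return 0                    # 3. nothing on STDERR; success code zero
```

In the actual script you would invoke it under a `__main__` guard as `sys.exit(main(sys.stdin, sys.stdout))`.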
You can see the STDOUT/STDERR output of the running script in Marked
using Help->Show Custom Processor Log
As far as size, no, there's not an inherent limit, and definitely not one you'd hit with a normal document.
Yeah, the scripts take STDIN, write to STDOUT and they work correctly on the command line. Classic UNIX filter scripts.
I guess the culprit is your point 2. I didn't know that I couldn't work through the data incrementally (I have not seen anything about that either in the Marked help or in any of your articles dealing with custom pre-processors). My script reads a line from STDIN, checks if it needs special handling and, if not, writes it to STDOUT, then reads the next line from STDIN, and so on.
I suppose that at this particular size the system's STDOUT buffer fills up and gets flushed, so Marked sees data coming back from the pre-processor, stops sending, and switches to reading instead?
I'll try to rewrite the script so that it gobbles the entirety of STDIN into memory and then works through that instead. I just thought that would unnecessarily slow things down, but at the size of a normal document I guess the overhead is negligible.
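That slurp-then-process approach could look something like the sketch below; `handle` and `filter_all` are made-up names for illustration:

```python
def handle(line):
    # Hypothetical per-line special-case handling; passes lines through as-is.
    return line


def filter_all(stdin_lines):
    # Gobble the whole input into memory, work through it there, and
    # buffer the results instead of writing each line out incrementally.
    buffered = [handle(line) for line in stdin_lines]
    return "".join(buffered)
```

The script would then call it once, e.g. `sys.stdout.write(filter_all(sys.stdin.readlines()))`, so nothing reaches STDOUT until all of STDIN has been consumed.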
Brett on 26 Apr, 2017 04:35 AM
If your script closes the STDOUT file handle in between writes, then
yes, the system will report that it's done writing after a couple of
microseconds. It's best to collect the output in a variable then write
it out at the end. Let me know if that works for you.
import sys
import logging

if __name__ == "__main__":
    # Read all of STDIN into memory before producing any output
    with sys.stdin as f:
        SOURCE = [line for line in f]
    logging.info("STDIN is " + str(sys.stdin))
    logging.info("STDOUT is " + str(sys.stdout))
    LINES = iter(SOURCE)
    logging.info("Dumping the output to STDOUT")
    for line in LINES:
        sys.stdout.write(line)
Scripts like that have no problem with any documents. I tried to bolt that kind of buffering onto my existing script, but it still does not want to cooperate, so I guess I have something else wrong. Now that I know that the minimal buffered script works, I can rewrite the processing logic into it and see if I can get it to work.
Brett on 28 Apr, 2017 04:02 PM
You should just be able to replace your current print/write statements
with appending to an output array, then writing out line by line as you
have or just doing a join/dump at the end. I'm not well versed in
Python, but that's how the scripts I've written in it have worked.
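A minimal sketch of that pattern, with `transform` standing in for whatever each former print/write statement emitted (both function names here are assumptions, not from the original scripts):

```python
def transform(line):
    # Stand-in for whatever each old print/write statement produced.
    return line.rstrip("\n").lower() + "\n"


def run(infile, outfile):
    output = []                     # append here instead of printing
    for line in infile:
        output.append(transform(line))
    outfile.write("".join(output))  # single join/dump at the very end
```

Called as `run(sys.stdin, sys.stdout)`, the script stays silent until the loop has consumed every input line, which matches the behavior Marked expects.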