More J Pandoc Syntax HighLighting

Syntax highlighting is essential for blogging program code. Many blog hosts recognize this and provide tools for highlighting programming languages. WordPress.com (this host) has a nifty highlighting tool that handles dozens of mainstream programming languages. Unfortunately, one of my favorite programming languages, J, (yes it’s a single letter name), is way out of the mainstream and is not supported.

There are a few ways to deal with this problem.

  1. Eschew J highlighting.
  2. Upgrade1 your WordPress.com subscription and install custom syntax highlighters that can handle arbitrary language definitions.
  3. Find another blog host that freely supports custom highlighters.
  4. Roll your own or customize an existing highlighter.

A few years ago I went with the fourth option and hacked the superb open-source tool pandoc. The grim details are described in this blog post. My hack produced a customized version of pandoc with J highlighting. I still use my hacked version and I’d probably stick with it if current pandoc versions had not introduced must-have features like converting Jupyter notebooks to Markdown, PDF, LaTeX and HTML. Jupyter is my default thinking-things-through programming environment. I’ve even taken to blogging with Jupyter notebooks. If you write and explain code you owe it to yourself to give Jupyter a try.

Unwilling to eschew J highlighting or forgo Jupyter I was on the verge of re-hacking pandoc when I read the current pandoc (version 2.9.1.1) documentation and saw that J is now officially supported by pandoc. You can verify this with the shell commands.

pandoc --version
pandoc --list-highlight-languages

The pandoc developers made my day! I felt like Wayne meeting a rock star.

Highlighting J is now a simple matter of placing J code in markdown blocks like:

  ~~~~ { .j }
      ... code code code ...
  ~~~~

and issuing shell commands like:

pandoc --highlight-style tango --metadata title="J test" -s jpdh.md -o jpdh.html

The previous command generated the HTML of this post which I pasted into the WordPress.com Classic Editor. Not only do I get J code highlighting on the cheap I also get footnotes which, for god freaking sakes,2 are not supported by the new WordPress block editor for low budget blogs.

The source markdown used for this post is available here – enjoy!


NB. Some J code I am currently using to test TAB
NB. delimited text files before loading them with SSIS.

NB. read TAB delimited table files as symbols - see long document
readtd2s=:[: s:@<;._2&> (9{a.) ,&.>~ [: <;._2 [: (] , ((10{a.)"_ = {:) }. (10{a.)"_) (13{a.) -.~ 1!:1&(]`<@.(32&>@(3!:0)))

tdkeytest=:4 : 0

NB.*tdkeytest v-- test natural key columns  of TAB delimited text
NB. files.
NB.
NB. Many of the raw tables of the ETL process depend on  compound
NB. primary keys. This verb applies a basic  test of primary  key
NB. columns. Passing this test  makes it very  likely  the  table
NB. will load  without key constraint  violations.  Failures  are
NB. still possible depending  on how  text  data is converted  to
NB. other  datatypes. Failure of this test indicates  a very high
NB. chance of key constraint violations.
NB.
NB. dyad:  il =. blclColnames tdkeytest clFile
NB.
NB.   f0=. 'C:\temp\dailytsv\raw_ST_BU.txt'
NB.   k0=. ;:'BuId XMLFileDate'
NB.   k0 tdkeytest f0
NB.
NB.   f1=. 'C:\temp\dailytsv\raw_ST_Item.txt'
NB.   k1=. ;:'BuId ItemId XMLFileDate'
NB.   k1 tdkeytest f1

NB. first row is header
h=. 0{d=. readtd2s y

NB. key column positions
'header key column(s) missing' assert -.(#h) e. p=. h i. s: x

c=. #d=. }. d
b=. ~:p {"1 d

NB. columns unique, rowcnt, nonunique rowcnt
if. r=. c = +/b do.
  r , c , 0
else.
  NB. there are duplicates show some sorted duplicate keys
  k=. p {"1 d
  d=. d {~ I. k e. k #~ -.b
  d=. (/: p {"1 d) { d
  b=. ~:p {"1 d
  m=. +/b
  smoutput (":m),' duplicate key blocks'
  n=. DUPSHOW <. m
  smoutput 'first ',(":n),' duplicate row key blocks'
  smoutput (<p { h) ,&.> n {. ,. b <;.1 p {"1 d
  r , c , #d
end.
)

  1. The pay more option is always available.
  2. WordPress.com is beginning to remind me of Adobe. Stop taking away longstanding features when upgrading!

Extracting SQL code from SSIS dtsx packages with Python lxml

Lately, I’ve been refactoring a sprawling SSIS (SQL Server Integration Services) package that ineffectually wrestles with large XML files. In this programmer’s opinion using SSIS for heavy-duty XML parsing is geeky self-abuse so I’ve opted to replace an eye-ball straining[1] SSIS package with half a dozen, “as simple as possible but no simpler”, Python scripts. If the Python is fast enough for production great! If not the scripts will serve as a clear model[2] for something faster.

I’m only refactoring[3] part of a larger ETL process so whatever I do it must mesh with the rest of the mess.

So where is the rest of the SSIS mess?

SSIS’s visual editor does a wonderful job of hiding the damn code!

This is a problem!

If only there was a simple way to troll through large sprawling SSIS spider-webby packages and extract the good bits. Fortunately, Python’s XML parsing tools can be easily applied to SSIS dtsx files. SSIS dtsx files are XML files. The following code snippets illustrate how to hack these files.

First import the required Python modules. lxml is not always included in Python distributions. Use the pip or conda tools to install this module.

# imports
import os
from lxml import etree

Set an output directory. I’m running on a Windows machine. If you’re on a Mac or Linux machine adjust the path.

# set sql output directory
sql_out = r"C:\temp\dtsxsql"
if not os.path.isdir(sql_out):
    os.makedirs(sql_out)

Point to the dtsx package you want to extract code from.

# dtsx files
dtsx_path = r'C:\Users\john\AnacondaProjects\testfolder\bixml'
ssis_dtsx = dtsx_path + r'\ParseXML.dtsx'
ssis_dtsx

Read and parse the SSIS package.

tree = etree.parse(ssis_dtsx)
root = tree.getroot()
root.tag
'{www.microsoft.com/SqlServer/Dts}Executable'

lxml renders XML namespace tags like <DTS:Executable as {www.microsoft.com/SqlServer/Dts}Executable. The following
shows all the transformed element tags in the dtsx package.

# collect unique element tags in dtsx
ele_set = set()
for ele in root.xpath(".//*"):
    ele_set.add(ele.tag)    
print(ele_set)
print(len(ele_set))
{'InnerObject', '{www.microsoft.com/SqlServer/Dts}PrecedenceConstraint', '{www.microsoft.com/SqlServer/Dts}ObjectData', '{www.microsoft.com/SqlServer/Dts}PackageParameter', '{www.microsoft.com/SqlServer/Dts}LogProvider', '{http://schemas.xmlsoap.org/soap/envelope/}Envelope', '{www.microsoft.com/SqlServer/Dts}Executable', 'ExpressionTask', '{http://schemas.xmlsoap.org/soap/envelope/}Body', '{www.microsoft.com/SqlServer/Dts}Variable', '{www.microsoft.com/SqlServer/Dts}ForEachVariableMapping', '{www.microsoft.com/SqlServer/Dts}PrecedenceConstraints', '{www.microsoft.com/SqlServer/Dts}SelectedLogProviders', 'ForEachFileEnumeratorProperties', '{www.microsoft.com/SqlServer/Dts}ForEachVariableMappings', '{www.microsoft.com/SqlServer/Dts}ForEachEnumerator', '{www.microsoft.com/SqlServer/Dts}SelectedLogProvider', '{www.microsoft.com/SqlServer/Dts}DesignTimeProperties', '{www.microsoft.com/SqlServer/Dts}LogProviders', '{www.microsoft.com/SqlServer/Dts}LoggingOptions', '{www.microsoft.com/SqlServer/Dts}Variables', 'FEFEProperty', 'FileSystemData', 'ProjectItem', '{www.microsoft.com/SqlServer/Dts}Property', '{http://www.w3.org/2001/XMLSchema}anyType', '{www.microsoft.com/SqlServer/Dts}Executables', '{www.microsoft.com/sqlserver/dts/tasks/sqltask}ParameterBinding', '{www.microsoft.com/sqlserver/dts/tasks/sqltask}ResultBinding', '{www.microsoft.com/SqlServer/Dts}VariableValue', 'FEEADO', 'BinaryItem', '{www.microsoft.com/SqlServer/Dts}PackageParameters', '{www.microsoft.com/SqlServer/Dts}PropertyExpression', '{www.microsoft.com/sqlserver/dts/tasks/sqltask}SqlTaskData', 'ScriptProject'}
36

Using transformed element tags of interest blast over the dtsx and suck out the bits of interest.

# extract sql code in source statements and write to *.sql files 
total_bytes = 0
package_name = root.attrib['{www.microsoft.com/SqlServer/Dts}ObjectName'].replace(" ","")
for cnt, ele in enumerate(root.xpath(".//*")):
    if ele.tag == "{www.microsoft.com/SqlServer/Dts}Executable":
        attr = ele.attrib
        for child0 in ele:
            if child0.tag == "{www.microsoft.com/SqlServer/Dts}ObjectData":
                for child1 in child0:
                    sql_comment = attr["{www.microsoft.com/SqlServer/Dts}ObjectName"].strip()
                    if child1.tag == "{www.microsoft.com/sqlserver/dts/tasks/sqltask}SqlTaskData":
                        dtsx_sql = child1.attrib["{www.microsoft.com/sqlserver/dts/tasks/sqltask}SqlStatementSource"]
                        dtsx_sql = "-- " + sql_comment + "\n" + dtsx_sql
                        sql_file = sql_out + "\\" + package_name + str(cnt) + ".sql"
                        total_bytes += len(dtsx_sql)
                        print((len(dtsx_sql), sql_comment, sql_file))
                        with open(sql_file, "w") as file:
                            file.write(dtsx_sql)
print(('total sql code bytes',total_bytes))
   (2817, 'Add Record to ZipProcessLog', 'C:\\temp\\dtsxsql\\2_ParseXML225.sql')
    (48, 'Dummy SQL - End Loop for ZipFileName', 'C:\\temp\\dtsxsql\\2_ParseXML268.sql')
    (1327, 'Add Record to XMLProcessLog', 'C:\\temp\\dtsxsql\\2_ParseXML293.sql')
    (546, 'Delete Prior Loads in XMLProcessLog Table for Looped XML', 'C:\\temp\\dtsxsql\\2_ParseXML304.sql')
    (759, 'Delete Prior Loads to Node Table for Looped XML', 'C:\\temp\\dtsxsql\\2_ParseXML312.sql')
    (48, 'Dummy SQL - End Loop for XMLFileName', 'C:\\temp\\dtsxsql\\2_ParseXML320.sql')
    (1862, 'Set Variable DeletePriorImportNodeFlag', 'C:\\temp\\dtsxsql\\2_ParseXML356.sql')
    (55, 'Shred XML to Node Table', 'C:\\temp\\dtsxsql\\2_ParseXML365.sql')
    (1011, 'Update LoadEndDatetime and XMLRecordCount in XMLProcessLog', 'C:\\temp\\dtsxsql\\2_ParseXML371.sql')
    (1060, 'Update LoadEndDatetime and XMLRecordCount in XMLProcessLog - Shred Failure', 'C:\\temp\\dtsxsql\\2_ParseXML382.sql')
    (675, 'Load object VariablesList (Nodes to process for each XML File Category)', 'C:\\temp\\dtsxsql\\2_ParseXML412.sql')
    (1175, 'Set Variable ZipProcessFlag - Has Zip Had A Prior Successful Run', 'C:\\temp\\dtsxsql\\2_ParseXML461.sql')
    (224, 'Set ZipProcessed Status (End Zip)', 'C:\\temp\\dtsxsql\\2_ParseXML474.sql')
    (238, 'Set ZipProcessed Status (Zip Already Processed)', 'C:\\temp\\dtsxsql\\2_ParseXML480.sql')
    (231, 'Set ZipProcessing Status (Zip Starting)', 'C:\\temp\\dtsxsql\\2_ParseXML486.sql')
    (609, 'Update LoadEndDatetime in ZipProcessLog', 'C:\\temp\\dtsxsql\\2_ParseXML506.sql')
    (613, 'Update ZipLog UnzipCompletedDateTime Equal to GETDATE', 'C:\\temp\\dtsxsql\\2_ParseXML514.sql')
    (1610, 'Update ZipProcessLog ExtractedFileCount', 'C:\\temp\\dtsxsql\\2_ParseXML522.sql')
    ('total sql code bytes', 14908)

The code snippets in this post are available in this Jupyter notebook: Extracting SQL code from SSIS dtsx packages with Python lxml. Download and tweak for your dtsx nightmare!



  1. I frequently run into SSIS packages that cannot be viewed on 4K monitors when fully zoomed out.
  2. Python’s readability is a major asset when disentangling mess-ware.
  3. Yes, I’ve railed about the word “refactoring” in the past but I’ve moved on and so should you. “A foolish consistency is the hobgoblin of little minds.”

NumPy another Iverson Ghost

During my recent SmugMug API and Python adventures I was haunted by an Iverson ghost: NumPy

An Iverson ghost is an embedding of APL like array programming features in nonAPL languages and tools.

You would be surprised at how often Iverson ghosts appear. Whenever programmers are challenged with processing large numeric arrays they rediscover bits of APL. Often they’re unaware of the rich heritage of array processing languages but in NumPy's case, they indirectly acknowledged the debt. In Numerical Python the authors wrote:

“The languages which were used to guide the development of NumPy include the infamous APL family of languages, Basis, MATLAB, FORTRAN, S and S+, and others.”

I consider “infamous” an upgrade from “a mistake carried through to perfection.”

Not only do developers frequently conjure up Iverson ghosts. They also invariably turn into little apostles of array programming that won’t shut up about how cutting down on all those goddamn loops clarifies and simplifies algorithms. How learning to think about operating on entire arrays, versus one dinky number at a time, frees the mind. Why it’s almost as if array programming is a tool of thought.

Where have I heard this before?

Ahh, I’ve got it, when I first encountered APL almost fifty years ago.

Yes, I am an old programmer, a fossil, a living relic. My brain is a putrid pool of punky programming languages. Python is just the latest in a longish line of languages. Some people collect stamps. I collect programming languages. And, just like stamp collectors have favorite stamps, I find some programming languages more attractive than others. For example, I recognize the undeniable utility of C/C++, for many tasks they are the only serious options, yet as useful and pervasive as C/C++ are they have never tickled my fancy. The notation is ugly! Yeah, I said it; suck on it C people. Similarly, the world’s most commonly used programming language JavaScript is equally ugly. Again, JavaScript is so damn useful that programmers put up with its many warts. Some have even made a few bucks writing books about its meager good parts.

I have similar inflammatory opinions about other widely used languages. The one that is making me miserable now is SQL, particularly Microsoft’s variant T-SQL. On purely aesthetic grounds I find well-formed SQL queries less appalling than your average C pointer fest. Core SQL is fairly elegant but the macro programming features that have grown up around it are depraved. I feel dirty when forced to use them which is just about every other day.

At the end of my programming day, I want to look on something that is beautiful. I don’t particularly care about how useful a chunk of code is or how much money it might make, or what silly little business problem it solves. If the damn code is ugly I don’t want to see it.

People keep rediscovering array programming, best described in Ken Iverson’s 1962 book A Programming Language, for two basic reasons:

  1. It’s an efficient way to handle an important class of problems.
  2. It’s a step away from the ugly and back towards the beautiful.

Both of these reasons manifest in NumPy‘s resounding success in the Python world.

As usual, efficiency led the way. The authors of Numerical Python note:

Why are these extensions needed? The core reason is a very prosaic one, and that is that manipulating a set of a million numbers in Python with the standard data structures such as lists, tuples or classes is much too slow and uses too much space.

Faced with a “does not compute” situation you can either try something else or fix what you have. The Python people fixed Python with NumPy. Pythonistas reluctantly embraced NumPy but quickly went apostolic! Now books like Elegant SciPy and the entire SciPy toolset that been built on NumPy take it for granted.

Is there anything in NumPy for programmers that have been drinking the array processing Kool-Aid for decades? The answer is yes! J programmers, in particular, are in for a treat with the new Python3 addon that’s been released with the latest J 8.07 beta. This addon directly supports NumPy arrays making it easy to swap data in and out of the J/Python environments. It’s one of those best of both worlds things.

The following NumPy examples are from the SciPy.org NumPy quick start tutorial. For each NumPy statement, I have provided a J equivalent. J is a descendant of APL. It was largely designed by the same man: Ken Iverson. A scumbag lawyer or greedy patent troll might consider suing NumPy‘s creators after looking at these examples. APL’s influence is obvious. Fortunately, Ken Iverson was more interested in promoting good ideas that profiting from them. I suspect he would be flattered that APL has mutated and colonized strange new worlds and I think even zealous Pythonistas will agree that Python is a delightfully strange world.

Some Numpy and J examples

Selected Examples from https://docs.scipy.org/doc/numpy-dev/user/quickstart.html Output has been suppressed here. For a more detailed look at these examples browse the Jupyter notebook:  NumPy and J Make Sweet Array Love.

Creating simple arrays

 
 # numpy
 a = np.arange(15).reshape(3, 5)
 
 NB. J
 a =. 3 5 $ i. 15

 # numpy
 a = np.array([2,3,4])
 
 NB. J
 a =. 2 3 4
 
 # numpy
 b = np.array([(1.5,2,3), (4,5,6)])
 
 NB. J
 b =. 1.5 2 3 ,: 4 5 6

 # numpy
 c = np.array( [ [1,2], [3,4] ], dtype=complex )
 
 NB. J
 j. 1 2 ,: 3 4
 
 # numpy
 np.zeros( (3,4) )
 
 NB. J
 3 4 $ 0
 
 # numpy - allocates array with whatever is in memory
 np.empty( (2,3) )
 
 NB. J - uses fill - safer but slower than numpy's trust memory method
 2 3 $ 0.0001 

Basic operations

 
 # numpy
 a = np.array( [20,30,40,50] )
 b = np.arange( 4 )
 c = a - b
 
 NB. J
 a =. 20 30 40 50
 b =. i. 4
 c =. a - b
 
 # numpy - uses previously defined (b)
 b ** 2
 
 NB. J
 b ^ 2

 # numpy - uses previously defined (a)
 10 * np.sin(a)
 
 NB. J
 10 * 1 o. a
 
 # numpy - booleans are True and False
 a < 35
 
 NB. J - booleans are 1 and 0
 a < 35

Array processing

 
 # numpy
 a = np.array( [[1,1], [0,1]] )
 b = np.array( [[2,0], [3,4]] )
 # elementwise product
 a * b

 NB. J
 a =. 1 1 ,: 0 1
 b =. 2 0 ,: 3 4
 a * b

 # numpy - matrix product
 np.dot(a, b)

 NB. J - matrix product
 a +/ . * b  
 
 # numpy - uniform pseudo random
 a = np.random.random( (2,3) )
 
 NB. J - uniform pseudo random
 a =. ? 2 3 $ 0
 
 # numpy - sum all array elements - implicit ravel
 a.sum(a)
 
 NB. J - sum all array elements - explicit ravel
 +/ , a
 
 # numpy
 b = np.arange(12).reshape(3,4)
 # sum of each column
 b.sum(axis=0)
 # min of each row
 b.min(axis=1)
 # cumulative sum along each row
 b.cumsum(axis=1)
 # transpose
 b.T     

 NB. J 
 b =. 3 4 $ i. 12
 NB. sum of each column
 +/ b
 NB. min of each row
 <./"1 b
 NB. cumulative sum along each row
 +/\"0 1 b
 NB. transpose
 |: b

Indexing and slicing

 
 # numpy 
 a = np.arange(10) ** 3 
 a[2]
 a[2:5]
 a[ : :-1]   # reversal

 NB. J
 a =. (i. 10) ^ 3
 2 { a
 (2 + i. 3) { a
 |. a

SWAG a J/EXCEL/GIT Personal Cash Flow Forecasting Mob

While browsing in a favorite bookstore with my son, I spotted a display of horoscope themed Christmas tree ornaments. The ornaments were glass balls embossed with golden birth signs like Aquarius, Gemini, Cancer, et cetera, and a descriptive phrase that “summed up” the character of people born under a sign. Below my birth sign golden text declared, “Imaginative and Suspicious.”

I said to my son, “I hate it when astrological rubbish is right.”

I am imaginative and suspicious; it’s a curse. When it comes to money my “suspicious dial” is permanently set on eleven. I assume everyone is out to cheat and defraud me until there is overwhelming evidence to the contrary. Paranoia is generally crippling but when it comes to cold hard cash it’s a sound retention strategy.

Prompted by an eminent life move, I found myself in need of a cash flow forecasting tool. Normal people deal with forecasting problems by buying standard finance programs or cranking up spreadsheets; imaginative and suspicious people roll their own.

SWAG

SWAG, (Silly Wild Ass Guess), is a hybrid J/EXCEL/GIT mob1 that meets my eccentric needs. I wanted a tool that:

  1. Abstracted away accounting noise.
  2. Was general and flexible.
  3. Used highly portable, durable, and version control friendly inputs and outputs.
  4. Reflected what ordinary people, not tax accountants, actually do with money.
  5. Is open source and unencumbered by parasitic software patents.

Amazingly, my short list of no-brainer requirements eliminates most standard finance programs. Time to code!

SWAG Inputs

The bulk of SWAG is a JOD generated self-contained J script. You can peruse the script here. SWAG inputs and outputs are brain-dead simple TAB delimited text tables. Inputs consist of monthly, null-free, numeric time series tables, scenario tables, and name cross-reference tables. Outputs are simple, null-free, numeric time series tables. Input and output time series tables have identical formats.

A few examples will make this clear. The following is a typical SWAG input and output time series table.

 Date        E0      E1       E2      E3  E4     E5  E6  E7  E8  EC  EF  Etotal   I0       I1  I2  I3  I4  I5  IC  Itotal   R0         R1        R2  R3  Rtotal     D0  D1  D2  D3  D4  Dtotal  BB       NW         U0         U1  U2  U3
 2015-09-01  912.00  1650.00  100.00  0   50.00  0   0   0   0   0   0   2712.00  4800.00  0   0   0   0   0   0   4800.00  130000.00  25000.00  0   0   155000.00  0   0   0   0   0   0       2088.00  157088.00  155000.00  0   0   0 
 2015-10-01  912.00  1656.88  100.00  0   50.00  0   0   0   0   0   0   2718.88  4806.00  0   0   0   0   0   0   4806.00  130054.17  25062.50  0   0   155116.67  0   0   0   0   0   0       2087.13  159291.79  0          0   0   0 
 2015-11-01  912.00  1663.78  100.00  0   50.00  0   0   0   0   0   0   2725.78  4812.01  0   0   0   0   0   0   4812.01  130054.17  25062.50  0   0   155116.67  0   0   0   0   0   0       2086.23  161378.02  0          0   0   0 
 2015-12-01  912.00  1670.71  100.00  0   50.00  0   0   0   0   0   0   2732.71  4818.02  0   0   0   0   0   0   4818.02  130054.17  25062.50  0   0   155116.67  0   0   0   0   0   0       2085.31  163463.33  0          0   0   0 
 2016-01-01  912.00  1677.67  100.00  0   50.00  0   0   0   0   0   0   2739.67  4824.05  0   0   0   0   0   0   4824.05  130054.17  25062.50  0   0   155116.67  0   0   0   0   0   0       2084.37  165547.70  0          0   0   0 
 2016-02-01  912.00  1684.66  100.00  0   50.00  0   0   0   0   0   0   2746.66  4830.08  0   0   0   0   0   0   4830.08  130054.17  25062.50  0   0   155116.67  0   0   0   0   0   0       2083.41  167631.12  0          0   0   0 
 2016-03-01  912.00  1691.68  100.00  0   50.00  0   0   0   0   0   0   2753.68  4836.11  0   0   0   0   0   0   4836.11  130054.17  25062.50  0   0   155116.67  0   0   0   0   0   0       2082.43  169713.55  0          0   0   0 

The first header line is a simple list of names. The first name “Date” heads a column of first of month dates in YYYY-MM-DD format. The SWAG clock has month resolution and dates are the only nonnumeric items. Names beginning with “E” like E0, E1, …, are aggregated expenses. Names beginning with “I” like I0, I1, I2 … are income totals. “R” names are reserves: basically savings, investments, equity and so forth. “D” names are various debts. BB is basic period balance, NW is period net worth and “U” names are utility series. Utility series facilitate calculations. Remaining names are self-explanatory totals. Be aware that this table has been formatted for this blog. Examples of raw input and output tables can be found here.

The next ingredient in the SWAG stew is what many call a scenario. A scenario is a collection of prospective assumptions and actions. In one scenario you buy a Mercedes and assume interest rates remain low. In another, you take the bus and rates explode. When forecasting I evaluate five basic scenarios, grim, pessimistic, expected, optimistic, and exuberant. Being a negative Debbie Downer type I rarely invest time in exuberant scenarios. I concentrate on grim and pessimistic scenarios because once you are mentally prepared for the worst anything better feels like a lottery win.

The following is a typical SWAG scenario table. Scenario tables, like time series tables, are simple TAB delimited text files.

 Name         Scenario On Group       Value  OnDate     OffDate    Method   MethodArguments                                                Description                                                                     
 reservetotal s0          assumptions 0      2015-09-01 2015-10-01 assume   RSavings=. 0.5 [ RInvest=. 3 [ REquity=. 3 [ ROther=. 1        annual nominal percent reserve growth or decline during period                  
 car          s0                      50     2015-09-01 2035-08-01 history                                                                 annualized car maintenance until first death                                    
 house        s0                      912    2015-09-01 2016-08-01 history  BackPeriods=.1                                                 current rent until move                                                         
 insurance    s0                      100    2015-09-01 2035-08-01 history                                                                 car insurance                                                                   
 living       s0                      1650   2015-09-01 2044-01-01 history  YearInflate=.5                                                 normal monthly living expenses                                                  
 salary       s0                      4800   2015-09-01 2016-08-01 history  BackPeriods=.4 [ YearInflate=.1.5                              maintain net monthly income until move                                          
 reservetotal s0                      25000  2015-09-01 2015-10-01 reserve  Initial=.1 [ RInvest=. 1                                       stock value at model start                                                      
 reservetotal s0                      130000 2015-09-01 2015-10-01 reserve  Initial=.1                                                     savings at model start                                                          
 salary       s0                      4200   2016-08-01 2023-07-01 history  BackPeriods=.4 [ YearInflate=.1.5                              reduced net income after move until retirement                                  
 house        s0          move        2000   2016-08-01 2016-09-01 history                                                                 moving expenses                                                                 
 house        s0                      100    2016-08-01 2044-01-01 history                                                                 incidental housing expenses after move                                          
 house        s0                      100    2016-08-01 2044-01-01 history                                                                 home owners association payments                                                
 house        s0                      150    2016-08-01 2044-01-01 history                                                                 property taxes                                                                  
 reservetotal s0          buy house   110000 2016-08-01 2016-09-01 reserve  Initial=.1 [ REquity=. 1                                       down payment added to house equity initial setting prevents double spend        
 loan         s0          buy house   150000 2016-08-01 2023-02-01 borrow   Interest=. 4.5 [ YearTerm=. 30 [ DHouse=.1  [ LoanEquity=.1    30 year mortgage rate on house until inheritance                                
 reservetotal s0          buy house   110000 2016-08-01 2016-09-01 spend                                                                   down payment on house from savings                                              
 annuities    s0                      250    2018-07-01 2035-08-01 history                                                                 monthly retirement and other annuity payments end date is unknown               
 annuities    s0                      50     2018-07-01 2035-08-01 history                                                                 any government pension payments                                                 
 reservetotal s0          assumptions 0      2020-01-01 2044-01-01 assume   RSavings=. -2.5 [ RInvest=. -5.0 [ REquity=. -5.0 [ ROther=. 2 market tanks government introduces negative interest                            
 loan         s0          buy car     10000  2020-07-01 2025-07-01 borrow   Interest=. 5 [ YearTerm=. 5 [ DCar=.1                          pay balance of car at 5% for 5 years                                            
 reservetotal s0          buy car     7000   2020-07-01 2020-08-01 spend                                                                   car down payment from savings                                                   
 reservetotal s0          inherit     180000 2023-01-01 2023-02-01 reserve  Initial=.1                                                     inheritance to savings                                                          
 reservetotal s0          buy house   150000 2023-03-01 2023-04-01 transfer Fee=. 1500 [ DHouse=.1                                         pay off remaining mortgage balance after inheritance fee is closing cost        
 salary       s0                      1400   2023-07-01 2035-08-01 history                                                                 estimated social security payments spread over expected life                    
 insurance    s0                      700    2023-07-01 2024-12-01 history                                                                 medical insurance in the gap between retirement and spouse medicare eligibility 
 annuities    s0                      100    2024-12-01 2044-01-01 history                                                                 any retirement pension payments to spouse                                       
 annuities    s0                      100    2035-08-01 2044-01-01 history                                                                 any us social security survivor benefit after first death                       

Again the first header row is a simple list of names. Most scenario names are self-explanatory but four OnDate, OffDate, Method, and MethodArguments merit some explanation. SWAG series methods assume, history, reserve, transfer, borrow, and spend are modeled on what people typically do with cash.

  1. assume sets expected interest rates and other global assumptions for a given time period. SWAG series methods operate over a well-defined time period. The period is defined by OnDate and OffDate.
  2. history looks at past periods and estimates a numeric value that is projected into the future. Currently, history computes simple means but the underlying code can use arbitrary time series verbs.
  3. reserve manages savings, investments, equity and other cash-like instruments.
  4. borrow borrows money and sets future loan payments. borrow supports simple amortization loans but is also capable of reading an arbitrary payment schedule that can be used for exotic2 loans.
  5. transfer moves money between reserves, debts, expenses and income series.
  6. spend does just what you expect.

SWAG series methods adjust all the series affected by the method. As you might expect SWAG arguments methods are detailed. MethodArguments uses a restricted J syntax to set SWAG arguments. Argument order does not matter but only supported names are allowed. Many examples of SWAG MethodArguments can be found in the EXCEL spreadsheet tp.xlsx. I use EXCEL as a scenario editor. By setting EXCEL filters, you can manage many scenarios.

The final SWAG input is a name cross-reference table. It is another TAB delimited text file that defines SWAG names. You can inspect a typical cross-reference table here.

Running SWAG

To run SWAG you:

  1. Prepare input files.
  2. Start J, any front-end jconsole, JQT or JHS will do, and load the Swag script.
  3. Execute RunTheNumbers.
  4. Open the EXCEL spreadsheet swag.xlsx, click on the data ribbon and then press the “Refresh All” button.

Let’s work through the steps.

Prepare Input Files

By far the most difficult step is the first. Here you review your financial status which means checking bank balances, stock values, loan balances and so on. Depending on your holdings this could take anywhere from minutes to hours. I call this updating actuals. Not only is updating actuals the most difficult and time-consuming step it is also the most valuable. Money that is not closely watched leaks away.

I store my actuals in a simple tabbed spreadsheet. Each tab maintains an image of a text file. I enter my data and then cut and paste the sheets into a text editor where I apply final tweaks and then save the sheets as TAB delimited text files.

Monthly income, expenses and debts are easy to update but some of my holdings do not offer monthly statements. The verb RawReservesFromLast in Swag.ijs fills in missing months with the last known values. When I’m finished preparing input files I’m left with four actual TAB delimited files, RawIncome.txt, RawExpenses.txt, RawReserves.txt, and RawDebts.txt. You can inspect example actual files here.

Start J and load the Swag script.

The SWAG script is relatively self-contained. It can be run in any J session that loads the standard J profile. Load Swag with the standard load utility.

 load 'c:/pd/fd/swag/swag.ijs'

Here SWAG is loaded in JHS.

jhsswag

Execute RunTheNumbers

RunTheNumbers sets the SWAG configuration, loads scenarios, copies actuals to each scenario, and then evaluates each scenario. Scenarios are numbered. I use positive numbers for “production” scenarios and negative numbers for test scenarios. It sounds more complicated that it is. This is all you have to do to execute RunTheNumbers

 RunTheNumbers 0 1 2 3 4 

The code is simple and shows what’s going on.

RunTheNumbers=:3 : 0

NB.*RunTheNumbers v-- compute all scenarios on list (y).
NB.
NB. monad:  blclFiles =. RunTheNumbers ilScenarios
NB.
NB.   RunTheNumbers 0 1 2 3 4

NB. parameters sheet is the last config sheet
ModelConfiguration_Swag_=:MainConfiguration_Swag_
parms=. ".;{:LoadConfig 0
scfx=. ScenarioPrefix

ac=. toHOST fmttd ActualSheet 0
ac write TABSheetPath,'MainActuals',SheetExt

sf=. 0$a:
for_sn. y do.
  ac write TABSheetPath,scfx,(":sn),'Actuals',SheetExt
  sf=. sf , parms Swag sn [ LoadSheets sn
end.

sf 
)

RunTheNumbers writes a pair of TAB delimited forecast and statistics files for each scenario it evaluates.

Open swag.xlsx and press “Refresh All”

The spreadsheet swag.xlsx loads SWAG TAB delimited text files and plots results.3 I plot monthly cash flow, estimated net worth and debt/equity for each scenario. The following is a typical cash flow plot. It estimates mean monthly cash balance over the scenario time range.

meanbalance

The polynomial displayed on the graph is an estimated trend line. Things are looking bleak.

Here’s a typical net worth plot.

networth

In this happy scenario, we die broke and leave a giant bill for the government.4

So far SWAG has met my basic needs and forced me to pay more attention to the proverbial bottom line. As I use the system I will fix bugs, refine rough spots, and add strictly necessary features. Feel free to use or modify SWAG for your own purposes. If you find SWAG useful please leave a note on this blog or follow the SWAG repository on GitHub.


  1. What do you call dis-integrated collections of programs that you use to solve problems? Declaring such dog piles “systems” demeans the word “system” and gives the impression that everything has been planned. This is not how I roll. “Mob” is far more appropriate. It conveys a proper sense of disorder and danger.
  2. When borrowing money you should always plan on paying it all back. Insist on a complete iron clad repayment schedule. If such a schedule cannot be provided run like hell or prepare for the thick end of a baseball bat to be rammed up your financial ass.
  3. It may be necessary to adjust file paths on the EXCEL DATA ribbon to load SWAG TAB delimited text files.
  4. They can see me in Hell to collect.

Turning JOD Dump Script Tricks

Have you ever wondered how extremely prolific bloggers do it? How is it possible to crank out thousands of blog entries per year without creating a giant stinking pile of mediocre doo-doo? Like most deep medium mysteries it’s not very deep and there are no mysteries. The spewers, people who post like teenage girls tweet, use two basic strategies:

  1. Multiple authors: The heroic image of the lone blogger waging holy war against a sea of Internet idiocy is largely a myth. Many popular prolific blogs are the work of many hands. The editor at Analyze the Data not the Drivel eschews this tactic. Apparently he’s an incontinent and argumentative prima donna that sane people steer clear of.
  2. Content recycling: In elementary school this was called copying. Now that we’re all grown up we use terms like, “excerpting”, “abstracting”, and my favorite “re-purposing.” The basic idea is simple. Take something you’ve written elsewhere and repackage it as something new. Hey, all the cool kids are doing it!

The following is a slightly edited new appendix I have just added to the JOD manual. I am working to properly publish the JOD manual mostly so I can say that I’ve written a legitimate, albeit strange and queer, book.

I created this post by running the \LaTeX code of the manual appendix through the excellent utility pandoc, tweaking the resulting markdown, and then using pandoc again to generate html for this blog. pandoc is a great “re-purposing” tool!  

Finally, re-purposing is not entirely cynical. The act of moving material from one medium to another exposes problems. I found a few editing errors while creating this post that eluded my \LaTeX eyes. If you find more this is your chance to tell me what a moron I am.

Turning JOD Dump Script Tricks

Dump script generation is my favorite JOD feature. Dump scripts serialize JOD dictionaries; they are mainly used to back up dictionaries and interact with version control systems. However, dump scripts are general J scripts and can do much more! Maintaining a stable of healthy JOD dictionaries is easier if you can turn a few dump script tricks.1

  1. Flattening reference paths: Open JOD dictionaries define a reference path. For example, if you open the following dictionaries:
       NB. open four dictionaries
       od ;:'smugdev smug image utils'
    +-+-----------------------+-------+----+-----+-----+
    |1|opened (ro/ro/ro/ro) ->|smugdev|smug|image|utils|
    +-+-----------------------+-------+----+-----+-----+

    the reference path is /smugdev/smug/image/utils.

    When objects are retrieved each dictionary on the path is searched in reference path order. If there are no compelling reasons to maintain separate dictionaries you can improve JOD retrieval performance and simplify dictionary maintenance by flattening all or part of the path.

    To flatten the reference path do:

       NB. reopen the first three dictionaries on the path
       od ;:'smugdev smug image' [ 3 od ''
    +-+--------------------+-------+----+-----+
    |1|opened (ro/ro/ro) ->|smugdev|smug|image|
    +-+--------------------+-------+----+-----+
    
       NB. dump to a temporary file (df)
       df=: {: showpass make jpath '~jodtemp/smugflat.ijs'
    +-+---------------------------+-----------------------+
    |1|object(s) on path dumped ->|c:/jodtemp/smugflat.ijs|
    +-+---------------------------+-----------------------+
    
       NB. create a new flat dictionary
       newd 'smugflat';jpath '~jodtemp/smugflat' [ 3 od ''
    +-+---------------------+--------+--------------------+
    |1|dictionary created ->|smugflat|c:/jodtemp/smugflat/|
    +-+---------------------+--------+--------------------+
    
       NB. open the flat dictionary and (utils)
       od ;:'smugflat utils'
    +-+-----------------+--------+-----+
    |1|opened (rw/ro) ->|smugflat|utils|
    +-+-----------------+--------+-----+
    
       NB. reload dump script ... output not shown ...  
       0!:0 df

    The collapsed path /smugflat/utils will return the same objects as the longer path. It is important to understand that the collapsed dictionary smugflat does not necessarily contain the same objects found in the three original dictionaries smugdev, smug and image. If objects with the same name exist in the original dictionaries only the first one found will be in the collapsed dictionary.

  2. Merging dictionaries: If two dictionaries contain no overlapping objects it might make sense to merge them. This is easily achieved with dump scripts. To merge two or more dictionaries do:
       NB. open and dump first dictionary
       od 'dict0' [ 3 od ''
    +-+--------------+-----+
    |1|opened (rw) ->|dict0|
    +-+--------------+-----+
       df0=: {: showpass make jpath '~jodtemp/dict0.ijs'
    +-+---------------------------+--------------------+
    |1|object(s) on path dumped ->|c:/jodtemp/dict0.ijs|
    +-+---------------------------+--------------------+
    
       NB. open and dump second dictionary
       od 'dict1' [ 3 od ''
    +-+--------------+-----+
    |1|opened (rw) ->|dict1|
    +-+--------------+-----+
       df1=: {: showpass make jpath '~jodtemp/dict1.ijs'
    +-+---------------------------+--------------------+
    |1|object(s) on path dumped ->|c:/jodtemp/dict1.ijs|
    +-+---------------------------+--------------------+
    
       NB. create new merge dictionary
       newd 'mergedict';jpath '~jodtemp/mergedict' [ 3 od ''
    +-+---------------------+---------+---------------------+
    |1|dictionary created ->|mergedict|c:/jodtemp/mergedict/|
    +-+---------------------+---------+---------------------+
    
       NB. open merge dictionary and run dump scripts
       od 'mergedict'
    +-+--------------+---------+
    |1|opened (rw) ->|mergedict|
    +-+--------------+---------+
    
       NB. reload dump scripts ... output not shown ...  
       0!:0 df0  
       0!:0 df1

    Be careful when merging dictionaries. If there are common objects the last object loaded is the one retained in the merged dictionary.

  3. Updating master file parameters: When a new parameter is added to jodparms.ijs it will not be available in existing dictionaries. With dump scripts you can rebuild existing dictionaries and update parameters. To rebuild a dictionary with new or custom parameters do:
       NB. save current dictionary registrations
       (toHOST ; 1 { 5 od '') write_ajod_ jpath '~temp/jodregister.ijs'
    
       NB. open dictionary requiring parameter update 
       od 'dict0' [ 3 od ''
    +-+--------------+-----+
    |1|opened (rw) ->|dict0|
    +-+--------------+-----+
    
       NB. dump dictionary and close
       df=: {: showpass make jpath '~jodtemp/dict0.ijs'
    +-+---------------------------+--------------------+
    |1|object(s) on path dumped ->|c:/jodtemp/dict0.ijs|
    +-+---------------------------+--------------------+
    
       3 od ''
    +-+---------+-----+
    |1|closed ->|dict0|
    +-+---------+-----+
    
       NB. erase master file and JOD object id file
       ferase jpath '~addons/general/jod/jmaster.ijf'
    1
       ferase jpath '~addons/general/jod/jod.ijn'
    1
    
       NB. recycle JOD - this recreates (jmaster.ijf) and (jod.ijn) 
       NB. using the new dictionary parameters defined in (jodparms.ijs)   
       (jodon , jodoff) 1
    1 1
    
       NB. re-register dictionaries
       load jpath '~temp/jodregister.ijs'
    
       NB. create a new dictionary - it will have the new parameters
       newd 'dict0new';jpath '~jodtemp/dict0new' [ 3 od ''
    +-+---------------------+---------+-------------------+
    |1|dictionary created ->|dict0new|c:/jodtemp/dict0new/|
    +-+---------------------+---------+-------------------+
    
       od 'dict0new'
    +-+--------------+--------+
    |1|opened (rw) ->|dict0new|
    +-+--------------+--------+
    
       NB. reload dump script ... output not shown ...
       0!:0 df  

    Before executing complex dump script procedures back up your JOD dictionary folders and play with dump scripts on test dictionaries. Dump scripts are essential JOD dictionary maintenance tools but like most powerful tools they must be used with care.


  1. Spicing up one’s rhetoric with a double entendre like “turning tricks” may be construed as a microaggression. The point of colored language is to memorably make a point. You are unlikely to forget turning dump script tricks.