An Accounting Companion in R Markdown

Foreword

The practice of double entry bookkeeping originated in renaissance Italy as a simple method for merchants to keep track of their daily transactions. The merchant who invented this concept sought to track the inflows of money into his business, while also tracking the outflows of jewelry that he sold. To do so, he devised this concept of credits and debits, and the basis of modern double-entry bookkeeping. Double entry bookkeeping soon spread from Italy to the rest of the world, as Italy was a great center of trade, renowned the world over for their innovations in commerce, science, industry, and the arts. While most people associate the renaissance with art, architecture, and sculpture- the foundations of modern civics and finance also took root during this time in history. From that humble beginning as as practice of a few Venice merchants, Accounting has become a globally important practice and spawned numerous entire professions; professions in which millions of finance professionals ply their trade every day.

While the concepts of double entry bookkeeping took hold - new issues arose - and banks and financiers started to form more formal bodies of rules to dictate exactly how these practices should take place. Specifics of how to record transactions, when to record them, and how to “balance the books”, became essential to the global practice of accounting. That leads us to modern times and the rules and formulas we will discuss in this book. Modern accounting practices in the US follow a rule set called the Generally Accepted Accounting Principles (GAAP) - a set of rules and advice from the Financial Accounting Standards Board (FASB). Although GAAP offers some black and white guidance, it also has gray areas, limits and loopholes that may be exploited.

This guide translates the key elements and concepts from the Accountant’s toolbox into reusable R code, and and presents textbook-style guide. In this guide, each chapter contains a different general subject matter, and contains a variety of formulas that can be called as a function. These functions may be called in a series, or nested into other formulas, to generate useful calculations and visualizations.

This guide is intended for finance professionals, statisticians, data scientists, math teachers, students*, or anyone else who needs to generate, visualize, apply, or solve accounting problems.

Load packages

This R markdown file will use several packages as defined below. This code will load the packages only of they are not already installed. These packages will be loaded into the local scope via the library functions.

if(!require(pracma)) install.packages("pracma", repos = "http://cran.us.r-project.org")
if(!require(dplyr)) install.packages("dplyr", repos = "http://cran.us.r-project.org")
if(!require(ggplot2)) install.packages("ggplot2", repos = "http://cran.us.r-project.org")
if(!require(ggforce)) install.packages("ggforce", repos = "http://cran.us.r-project.org")
if(!require(gtools)) install.packages("gtools", repos = "http://cran.us.r-project.org")
if(!require(matlib)) install.packages("matlib", repos = "http://cran.us.r-project.org")
if(!require(MASS)) install.packages("MASS", repos = "http://cran.us.r-project.org")
if(!require(kableExtra)) install.packages("kableExtra", repos = "http://cran.us.r-project.org")
if(!require(RcppAlgos)) install.packages("RcppAlgos", repos = "http://cran.us.r-project.org")
if(!require(latex2exp)) install.packages("latex2exp", repos = "http://cran.us.r-project.org")
if(!require(openxlsx)) install.packages("openxlsx", repos = "http://cran.us.r-project.org")

library(pracma)
library(dplyr)
library(ggplot2)
library(ggforce)
library(gtools)
library(matlib)
library(MASS)
library(kableExtra)
library(RcppAlgos)
library(latex2exp)
library(openxlsx)

options(scipen=999)

Fundamentals of Bookkeeping

Since accounting deals primarily in the transfer of money, it is important for any system of tracking to start with some data collection rules and standards. When we make a sale we are collecting income - when we pay a bill we are incurring expenses.

Data needs to be collected about each transaction, so we can keep track of what the income or expense was for. R provides a powerful platform for accounting, as it combines data management, visualization, and advanced mathematics into one platform.

So, to start out - we will consider what kinds of data you might need to collect for a bookkeeping system. Some of the obvious details that come to mind for a transaction are things like the date and what was sold. Some less obvious things you want to track in an accounting environment - are things like the account code, transaction type (debit or credit) and other codes like a purchase order, invoice number, or other categorical / filing details.

Defining Transactions

A transaction is an exchange of goods or value. Some types of transactions may result in nothing actually changing hands - rather they may simply be transfers between different parts of your account ledger. But the most obvious example of a transaction is when you sell something. In this case, you are moving money out of your assets (inventory) and into your income (sales).

Debits & Credits

The idea of debits and credits can be confusing at first. For some accounts, a debit is positive and a credit is negative, for other accounts a debit is negative and a credit is positive. This depends on which side of the balance sheet the account belongs to.

Assets

Assets are the amounts & things you Own.

Liabilities

Liabilities are the amounts & things you Owe.

Income

Income is the inflow of money into your business.

Expense

Expenses are the things you pay out of your business.

Equity

Equity is your ownership or net value of your business.

Revenue

Revenue is your profits on income, minus expenses on the things you sold.

Recording Transactions

To record a transaction in R, you would probably want to start with an R data frame object. A data frame or data table, is the foundation of a more complex structure of relational tables that might track transactions and help you balance your ledger. SOme of the details we might want to record for a transaction include the date and amount of the transaction, what type of transaction it is, and who we transacted with. Lets see how you might do that using R code:

Creating an Accounts Table

The idea behind double entry bookkeeping is to have a list of accounts, that taken together should always balance out to 0. This means for every dollar you take in, you place a dollar into one account, and take a dollar out of another account.

accounts_df <- data.frame()
accounts_df

Creating a General Ledger (GL) table

A general ledger table should collect all the key details about each transaction.

ledger_df <- data.frame()
ledger_df

Creating a Cash Flow (CF) statement table

A cash flow statement should record movements into and out of cash accounts

cashflow_df <- data.frame()
cashflow_df

Creating an Income statement

An income statement should record profit and expense related to goods sold

income_df <- data.frame()
income_df

Add an Account

Now we will create a few functions to start adding lines to our account and ledger tables. We will need accounts to be set up before enteriong any transactions - so we will start there.

add_account <- function( 
  category,
  sub_category,
  acct_name,
  acct_code,
  acct_open_date,
  acct_balance,
  acct_direction,
  acct_permanent){

  new_row <- data.frame(
    category=category,
    sub_category=sub_category,
    acct_name=acct_name,
    acct_code=acct_code,
    acct_open_date=acct_open_date,
    acct_balance=acct_balance,
    acct_direction=acct_direction,
    acct_permanent=acct_permanent
  )

  accounts_df <<- accounts_df %>% bind_rows(new_row)
  
}

Create a Chart of Accounts

OK, we now have a mechanism (function) to add data to our accounts table. Lets start by adding some key accounts with 0 balances. This will create our initial “chart of accounts”

#set the open_date for all of these accounts
open_date <- as.Date("2020-1-1")

#add_account(category,sub_category,acct_name,acct_code,acct_open_date,acct_balance)

add_account("Assets","Cash","Cash on Hand",1111,open_date,0,1,1)
add_account("Assets","Cash","Checking Account",1112,open_date,0,1,1)
add_account("Assets","Cash","Money Market",1113,open_date,0,1,1)

add_account("Assets","Accounts receivable","Credit Receipts",1211,open_date,0,1,1)

add_account("Assets","Inventory","Raw Materials",1411,open_date,0,1,1)
add_account("Assets","Inventory","Raw Packaging",1412,open_date,0,1,1)
add_account("Assets","Inventory","Finished Products",1420,open_date,0,1,1)

add_account("Assets","Land & Buildings","Land",1611,open_date,0,1,1)
add_account("Assets","Land & Buildings","Buildings",1612,open_date,0,1,1)

add_account("Liabilities","Accounts payable","Accounts payable",2111,open_date,0,0,1)
add_account("Liabilities","Debt","Long-Term Debt",2112,open_date,0,0,1)

add_account("Equity","Owners Equity","Capital",3111,open_date,0,0,1)
add_account("Equity","Owners Equity","Retained Earnings",3112,open_date,0,0,1)

add_account("Revenues","Sales","Sales Revenue",4011,open_date,0,0,0)

add_account("Expenses","Operating Expenses","Wages",4111,open_date,0,1,0)
add_account("Expenses","Operating Expenses","Taxes",4112,open_date,0,1,0)
add_account("Expenses","Occupancy","Rent",4113,open_date,0,1,0)
add_account("Expenses","Advertising","Advertising",4114,open_date,0,1,0)
add_account("Expenses","Communications","Phone",4115,open_date,0,1,0)

add_account("Expenses","Inventory Expense","Raw Materials",5111,open_date,0,1,0)
add_account("Expenses","Inventory Expense","Packaging",5112,open_date,0,1,0)
add_account("Expenses","Inventory Expense","Finished Products",5113,open_date,0,1,0)

Lets take a look at our chart of accounts now.

accounts_df

Add a Transaction to the Ledger

Now we will create a few functions to start adding lines to our account and ledger tables. We will need these accounts to be set up before entering any transactions - so we’ll start there.

#create a function to add transactions
add_transaction <- function( 
  trans_amount=NULL,
  acct_lookup=NULL,
  trans_date=NULL,
  budget_code=NULL,
  po_number=NULL,
  invoice_number=NULL,
  payment_date=NULL,
  payment_method=NULL){
  
  validated = TRUE
  
  #make sure at least the account code and $ amount are set
  if(is.null(trans_amount)){
     msg_confirm <<- paste('Non-0 Transaction amount is required.')
     validated = FALSE
  }
  if(is.null(acct_lookup)){
     msg_confirm <<- paste('Acct code required.')
    validated = FALSE
  }
 if(!validated){
   success <- FALSE
   return(success)
 }
  
  #set date to now if not provided
 if(is.null(trans_date)){
    trans_date = Sys.time()
 }
  #set budget_code to blank if not provided
 if(is.null(trans_date)){
    budget_code = ""
 }
  #set po_number to blank if not provided
 if(is.null(po_number)){
    po_number = ""
 }
  #set invoice_number to blank if not provided
 if(is.null(invoice_number)){
    invoice_number = ""
 }
  #set payment_date to blank if not provided
 if(is.null(payment_date)){
    payment_date = ""
 }
  #set payment_method to blank if not provided
 if(is.null(payment_method)){
    payment_method = ""
 }
  
   #get the name of this account to stamp on the ledger
  acct_name <- accounts_df %>% filter(acct_code==acct_lookup) %>% pull(acct_name)
 
    #get the direction of this account to determine the transaction type
  acct_direction <- accounts_df %>% filter(acct_code==acct_lookup) %>% pull(acct_direction)
 
 if(acct_direction ==1){
   #a positive number is a debit
   if(trans_amount>0){ 
     trans_type = "Debit"
   }else{ 
     trans_type = "Credit"
     }
 }else{
  #a positive number is a credit
     if(trans_amount>0){ 
     trans_type = "Credit"
   }else{ 
     trans_type = "Debit"
     }
 }
  
  abs_trans_amount <- abs(trans_amount)
  
  #create a row for the transaction
  new_row <- data.frame(
  trans_date=trans_date,
  trans_amount=abs_trans_amount,
  trans_type=trans_type,
  acct_name=acct_name,
  acct_code=acct_lookup,
  budget_code=budget_code,
  po_number=po_number,
  invoice_number=invoice_number,
  payment_date=payment_date,
  payment_method=payment_method
  )

  #get the old balance
  curr_balance <- accounts_df %>% filter(acct_code==acct_lookup) %>% pull(acct_balance)
    
  
  #add this transaction amount
  new_balance <- curr_balance + trans_amount
  
  #set a confirmation message in the environment
  msg_confirm <<- paste('Balance was updated due to a ',trans_type,' - from ',curr_balance,' to ',new_balance,' on account:',acct_name,sep='')
  
  #update the account with the new balance
  accounts_df <<- accounts_df %>% mutate(acct_balance = ifelse(acct_code == acct_lookup,new_balance,acct_balance))

  #return the ledger
  ledger_df <<- ledger_df %>% bind_rows(new_row)
  
  success <- TRUE
  return(success)
}

Funding the business with the owners initial investment

Now we will add the 10K initial cash investment by adding 10K to cash (on the left side of the balance sheet) and 10K to Equity on the right side.

#set up one transaction of $10000 to cash
#the add_transaction function will figure out if its a debit or credit, based on if I pas a positive or negative number.
trans_amount <- 10000
acct_code <- 1111
trans_date <- Sys.time()
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""

success_A <- add_transaction(
  trans_amount,
  acct_code,
  trans_date,
  budget_code,
  po_number,
  invoice_number,
  payment_date,
  payment_method)

If the operation succeeded, we should get a message that it did.

if(success_A){
  #the transaction recorded correctly
  msg_confirm
}else{
  #the transaction recorded incorrectly
  msg_confirm
}
## [1] "Balance was updated due to a Debit - from 0 to 10000 on account:Cash on Hand"

Now we’ll examine the accounts and ledger.

ledger_df
accounts_df

Looks good - as expected, the Cash on Hand balance is now $10,000.

So we now have exactly one transaction in our ledger. But double entry bookkeeping means that you should really have 2 entries (at least) for each transaction. This is how you will maintain your balance sheet - by adding a debit in one account, and an equal credit in another (usually).

So lets go ahead and add another transaction in the same amount - this time adding 10K in owner equity - this is what the business “owes” it’s owner in terms of stock or capital:

if(success_A){
  trans_amount <- 10000
  acct_code <- 3111
  trans_date <- as.Date("2020-1-1")
  budget_code <- ""
  po_number <- ""
  invoice_number <- ""
  payment_date <- ""
  payment_method <- ""
  
  success_B <- add_transaction(
    trans_amount,
    acct_code,
    trans_date,
    budget_code,
    po_number,
    invoice_number,
    payment_date,
    payment_method)
  
  
}

If the operation succeeded, we should get a message that it did.

if(success_B){
  msg_confirm
}else{
  msg_confirm
}
## [1] "Balance was updated due to a Credit - from 0 to 10000 on account:Capital"
simple_plot <- ggplot() + 
  geom_col(data=accounts_df, aes(x=acct_name,y=acct_balance,fill=category)) + 
   ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)') + theme(axis.text.x = element_text(angle = 90))
 
 simple_plot

Buying your initial inventory of raw materials

Now we need to take some money out of our cash on hand, and buys some materials we want to sell Since the owner has funded the initial $1000 investment - we just need to make some transfers now from our cash account, into our other accounts we are tracking. We will take $500 in cash and go to our distributor, where we buy $350 in raw materials, and $150 in packaging.

#set up one transaction of $10000 to cash
#the add_transaction function will figure out if its a debit or credit, based on if I pas a positive or negative number.
trans_amount <- -500
acct_code <- 1111
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""

success_A <- add_transaction(
  trans_amount,
  acct_code,
  trans_date,
  budget_code,
  po_number,
  invoice_number,
  payment_date,
  payment_method)

If the operation succeeded, we should get a message that it did.

if(success_A){
  msg_confirm
}else{
  msg_confirm
}
## [1] "Balance was updated due to a Credit - from 10000 to 9500 on account:Cash on Hand"

Now we’ll examine the accounts and ledger.

ledger_df
accounts_df
simple_plot <- ggplot() + 
  geom_col(data=accounts_df, aes(x=acct_name,y=acct_balance,fill=category)) + 
   ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)') + theme(axis.text.x = element_text(angle = 90))
 
 simple_plot

Now that we have spent some of our cash on hand, we need to recall what we spent it on. In this case, were purchasing some raw materials ($350) and packaging ($150) for our business.

if(success_A){
  trans_amount <- 350
  acct_code <- 1411
  trans_date <- as.Date("2020-1-3")
  budget_code <- ""
  po_number <- ""
  invoice_number <- ""
  payment_date <- ""
  payment_method <- ""
  
  success_B <- add_transaction(
    trans_amount,
    acct_code,
    trans_date,
    budget_code,
    po_number,
    invoice_number,
    payment_date,
    payment_method)
  
   trans_amount <- 150
  acct_code <- 1412
  trans_date <- as.Date("2020-1-3")
  budget_code <- ""
  po_number <- ""
  invoice_number <- ""
  payment_date <- ""
  payment_method <- ""
  
   success_C <- add_transaction(
    trans_amount,
    acct_code,
    trans_date,
    budget_code,
    po_number,
    invoice_number,
    payment_date,
    payment_method)
}

If the operation succeeded, we should get a message that it did.

if(success_B){
  msg_confirm
}else{
  msg_confirm
}
## [1] "Balance was updated due to a Debit - from 0 to 150 on account:Raw Packaging"

Then we take another $300, go down to the furniture store to buy some tables, a market umbrella, and some other fixtures we will use to sell this stuff.

Now we’ll examine the accounts and ledger.

ledger_df
accounts_df
simple_plot <- ggplot() + 
  geom_col(data=accounts_df, aes(x=acct_name,y=acct_balance,fill=category)) + 
   ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)') + theme(axis.text.x = element_text(angle = 90))
 
 simple_plot

#use half teh raw materials...
trans_amount <- -175
acct_code <- 1411
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""

success_A <- add_transaction(
  trans_amount,
  acct_code,
  trans_date,
  budget_code,
  po_number,
  invoice_number,
  payment_date,
  payment_method)

#and half teh raw packaging...
trans_amount <- -75
acct_code <- 1412
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""

success_B <- add_transaction(
  trans_amount,
  acct_code,
  trans_date,
  budget_code,
  po_number,
  invoice_number,
  payment_date,
  payment_method)

#to make $250 (at cost) worth of finished products
trans_amount <- 250
acct_code <- 1420
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""

success_C <- add_transaction(
  trans_amount,
  acct_code,
  trans_date,
  budget_code,
  po_number,
  invoice_number,
  payment_date,
  payment_method)

So now we have two transactions in the ledger that balance each other. Lets take it a step further and define some common types of transactions and which accounts they affect., These rules will become built-in, multi-step ledger entries to ensure that when entering new transactions, we do so in as few steps as possible, and always using a consistent process for each series of transactions.

GAAP accounting as some formal elements that can help us work out what those rules should be. The income statement is a great way to figure out what portions of each sale are attributable to what parts of the balance sheet. In this example, the cash is entered for the cash on hand account, inventory removed from the inventory area, and the net profit is entered into equity.

Making some initial inventory & selling finished products

Once we have acquired raw goods, we’ll need to make them into finished products for sale, and sell them!

To keep track of this series of steps, we need to make some entries in the ledger. The entries that will be needed include debits or credits to the following accounts:

Cash on Hand (1111) Sales Revenue (4011) Raw Packaging Raw Materials Retained Earnings

This is a complex series of steps. Equations are needed to take some inputs and produce a correct set of entries in the ledger. This calls for a reusable function. The function will take as inputs:

Sale Amount
Cost of Goods Sold
Operating Expense
Less Taxes

And from these inputs it will calculate some other key factors including:

Gross profit (Sales - COGS)
Operating Profit (Gross profit - Operating Expense)
Net Income (Operating profit - Less Taxes)

Since this is a guide about accounting, and not manufacturing per-se - we are not going to worry too much about the details of the manufacturing process just yet. For now, we will just estimate some of the production costs that go into our products; later in this guide, we will look at how to more accurately track the production costs through - you guessed it - double-entry bookkeeping.

add_cash_income <- function( 
  sales,
  cogs,
  operating_expense,
  less_taxes){

   gross_profit <- sales - cogs
   operating_profit <- gross_profit-operating_expense
   net_income <- operating_profit - less_taxes
  
  #create a new data row with all the input & calculated values
  new_row <- data.frame(
    trans_date=Sys.time(),
    sales=sales,
    cogs=cogs,
    gross_profit=gross_profit,
    operating_expense=operating_expense,
    operating_profit=operating_profit,
    less_taxes=less_taxes,
    net_income=net_income
  )
  #add the row to income df
  income_df <<- income_df %>% bind_rows(new_row)
  
  #get the old balance
  curr_balance <- accounts_df %>% filter(acct_code==1111) %>% pull(acct_balance)
  
  #create a new row for the cashflow table
  new_row <- data.frame(
    trans_date=Sys.time(),
    begin_cash=curr_balance,
    collected_cash=sales,
    paid_cash=0,
    end_cash=curr_balance+sales
  )
  cashflow_df <<- cashflow_df %>% bind_rows(new_row)
  
  #get the time
  time_stamp <- Sys.time()

  #add the sales as a positive
  add_transaction(
  sales,
  1111,
  time_stamp,
  "",
  "",
  "",
  "",
  "")
  
  #add the sales as a positive (credit) to revenue
  add_transaction(
  sales,
  4011,
  time_stamp,
  "",
  "",
  "",
  "",
  "")
  
  #add the cost of good sold as a negative to finished inventory
   add_transaction(
  -cogs,
    1420,
  time_stamp,
  "",
  "",
  "",
  "",
  "")
   
   
  #add the cost of good sold as a positive to inventory expenses for raw materials
   add_transaction(
  cogs,
    5111,
  time_stamp,
  "",
  "",
  "",
  "",
  "")
   
    #add the net income to owner equity retained earnings
   add_transaction(
  net_income,
    3112,
  time_stamp,
  "",
  "",
  "",
  "",
  "")
}

We have the function for making cash sales - now it’s time to use it.

Scenario: Dealers showcase

In this scenario will go to a local dealers showcase and make some sales of our products. We will consider operating expense as 0 and tax as 0 since these are wholesale sales, and the maker does not yet take a salary - so we will only count the cost of the goods sold considering only raw materials.

Luckily, our dealers showcase was successful - we sold all $250 worth of product (at cost) that we made.

sale_amount <- 60
  cogs <- 30
  operating_expense <- 0
  less_taxes <- 0
  
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 40
  cogs <- 20
  operating_expense <- 0
  less_taxes <- 0
  
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 120
  cogs <- 75
  operating_expense <- 0
  less_taxes <- 0
  
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 110
  cogs <- 65
  operating_expense <- 0
  less_taxes <- 0
  
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 90
  cogs <- 60
  operating_expense <- 0
  less_taxes <- 0
  
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
cashflow_df
income_df
ledger_df
accounts_df

The cash flow statement

The cash flow statement tracks cash transactions only.

The Balance Sheet

The balance sheet is a critical tool for managing the inflows and outflows of money between the various accounts. This is where the double-entry concept really comes in - for each transaction on “one side” of the balance sheet - there should be a matching transaction on th e:other side” - somewhere. Knowing where the transaction should appear is really what GAAP rules help define.

So in the above example - I have traded $10,000 in cash for $10,000 in capital. My Equity has increased on the one side, and my assets have increased on the other.

There are a couple key formulas we need to introduce now:

  1. Assets = Liabilities + Equity
  2. Revenue - Expenses = Profit

So in this case, the balance sheet should be balanced - we should have 10,000 in assets and an equal amount in Equity.

So lets check out our balance sheet now by summing up each of the accounts we have defined above.

balance_sheet <- function(){
  accts_total <<- accounts_df %>% group_by(category,acct_direction) %>% summarize(curr_balance=sum(acct_balance))

  assets_balance <<- accts_total %>% filter(category == "Assets") %>% pull(curr_balance)
  liabilities_balance <<- accts_total %>% filter(category == "Liabilities") %>% pull(curr_balance)
  equity_balance <<- accts_total %>% filter(category == "Equity") %>% pull(curr_balance)
  revenue_balance <<- accts_total %>% filter(category == "Revenues") %>% pull(curr_balance)
  expenses_balance <<- accts_total %>% filter(category == "Expenses") %>% pull(curr_balance)
  
  are_permanent_accounts_balanced <- assets_balance==liabilities_balance+equity_balance
  are_permanent_accounts_balanced
  }

balance_sheet()
## [1] TRUE
accts_total
simple_plot <- ggplot() + 
  geom_col(data=accts_total, aes(x=category,y=curr_balance,fill=category)) + 
   ggtitle('Balance Sheet') + labs(x='Account Category',y='Balance ($)')
 
 simple_plot

Are our accounts balanced?

Here we can check if the permanent* accounts are balanced:

(Assets, Liabilities, Equity and Permanent accounts. Revenue and expenses are “temporary” accounts which reset each year.)

are_permanent_accounts_balanced <- assets_balance == liabilities_balance+equity_balance
are_permanent_accounts_balanced
## [1] TRUE

So far, so good - we have balanced our accounts.

cashflow_df
profit_balance <- revenue_balance-expenses_balance
profit_balance
## [1] 170

And we now have made profit_balance in profit.

cashflow_df$id <- as.numeric(row.names(cashflow_df))

simple_plot <- cashflow_df %>% ggplot(aes(x=id,y=end_cash)) + 
  geom_point() + 
  geom_line(aes(x=id,y=end_cash)) + 
  ggtitle('Cash Flow Summary') + labs(x='Transaction',y='Cash')
 
 simple_plot

We have explored how to make entries into a general ledger, income statement, cash flow statement, and balance sheet.

Now, where do you go from here? Perhaps you would like to have the cost of goods sold (COGS) be calculated for you? Let see how that might work by exploring the inventory account more thoroughly.

Tracking Profits through rigorous data management

To make this all more tangible, lets image we have a soap company. We make soaps, lotions, fizzy bath bombs, and other personal care products. Now lets list out some recipes for our most popular products. This will require setting up some relational data frames that will cross reference each other. The tables should follow relational rules and strive to approximate third normal form.

Units of Measure Table

This table keeps the unit of measure of each ingredient source. Some are measured in oz. dry weight, some in fl. oz. liquid volume, and others are measured by the drop. Keeping the units of measure will help in case we ever need to convert between like measures, say, oz vs lb. or liters vs floz.

unit_df <- data.frame()
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=1,unit='oz',unit_name='ounce',unit_type='Dry Weight',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=2,unit='floz',unit_name='fluid ounce',unit_name='Liquid Volume',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=3,unit='drop',unit_name='droplet',unit_name='Liquid Volume',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=4,unit='lb',unit_name='pound',unit_name='Dry Weight',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=5,unit='ea',unit_name='each',unit_name='Each',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=6,unit='g',unit_name='gram',unit_name='Dry Weight',system='metric'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=7,unit='l',unit_name='liter',unit_name='Liquid Volume',system='metric'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=8,unit='gal',unit_name='gallon',unit_name='Liquid Volume',system='metric'))

Recipe Table

This table stores some recipes we will analyze for cost & profit.

recipe_df <- data.frame()
recipe_df <- recipe_df %>% bind_rows(data.frame(recipe_id=1,prod_code='BB-LAV',recipe_name='bath bomb - lavendar',batch_size=20,price_each=4,price_batch=80))
recipe_df <- recipe_df %>% bind_rows(data.frame(recipe_id=2,prod_code='BB-ROSE',recipe_name='bath bomb - rose',batch_size=16,price_each=4.50,price_batch=72))
recipe_df <- recipe_df %>% bind_rows(data.frame(recipe_id=3,prod_code='BB-ORNG',recipe_name='bath bomb - citrus',batch_size=24,price_each=3.75,price_batch=90))

Raw Ingredients & Materials Table

This table keeps track of our raw ingredients

ingred_df <- data.frame()
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=1,ing_code='EP-SALT',ingredient_name='epsom salt',unit_id=1))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=2,ing_code='B-SODA',ingredient_name='baking soda',unit_id=1))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=3,ing_code='COCO-OIL',ingredient_name='coconut oil',unit_id=2))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=4,ing_code='ESS-OIL-LAV',ingredient_name='essential oil - lavender',unit_id=3))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=5,ing_code='ESS-OIL-ROSE',ingredient_name='essential oil - rose',unit_id=3))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=6,ing_code='ESS-OIL-ORNG',ingredient_name='essential oil - citrus',unit_id=3))

Recipe Parts Table

This table keeps track of what raw ingredients are needed for each recipe, in what amounts.

recipeparts_df <- data.frame()
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=1,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=2,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=3,recipe_amount=6))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=4,recipe_amount=40))

recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=1,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=2,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=3,recipe_amount=6))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=5,recipe_amount=50))

recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=1,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=2,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=3,recipe_amount=6))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=6,recipe_amount=30))

Lookup a Recipe

We can look up any of our recipes by ID using this:

recipe_df %>% 
  left_join(recipeparts_df,by='recipe_id') %>% 
  left_join(ingred_df,by='ing_id') %>% 
  left_join(unit_df,by='unit_id') %>% 
  filter(recipe_id==1) %>%
  dplyr::select(prod_code,recipe_name,recipe_amount,unit,ing_code,ingredient_name, price_each, price_batch, batch_size)

OK, so here is a recipe we can use to calculate cost of goods sold (COGS). To do so, we need to know what the cost of all the individual ingredients are in this recipe.

Ingredient Cost Table

This table will store the purchase price and unit size of each source package, a portion of which will be used by each recipe.

ingred_cost_df <- data.frame()
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=1,cost_amount=12,unit_size=48))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=2,cost_amount=9,unit_size=32))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=3,cost_amount=5,unit_size=24))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=4,cost_amount=4,unit_size=160))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=5,cost_amount=5,unit_size=160))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=6,cost_amount=6,unit_size=160))

Calculate costs for each recipe batch

OK, lets see how our recipe stacks up now that we have our material costs factored in. Drumroll please…

recipe_lookup <- 1

recipe_cost <- recipe_df %>% 
  left_join(recipeparts_df,by='recipe_id') %>% 
  left_join(ingred_df,by='ing_id') %>% 
  left_join(unit_df,by='unit_id') %>% 
  left_join(ingred_cost_df,by='ing_id') %>% 
  filter(recipe_id==recipe_lookup) %>% 
  mutate(cost_ing = cost_amount * recipe_amount / unit_size, cost_ing_per = (cost_amount * (recipe_amount / unit_size))/batch_size,) %>% 
  dplyr::select(prod_code,recipe_name,recipe_amount,unit,ing_code,ingredient_name,cost_ing,batch_size,cost_ing_per) 


recipe_cogs <- recipe_cost %>% summarize(batch_cost=sum(cost_ing)) %>% pull(batch_cost)
recipe_value <- recipe_df %>% filter(recipe_id==recipe_lookup) %>% pull(price_batch)
gross_profit <- recipe_value-recipe_cogs

profit_margin <- round((gross_profit / recipe_cogs) * 100,2)

paste('The cost of good sold on this batch is',recipe_cogs,' for a gross profit of',gross_profit,' and a profit margin on this recipe of:',profit_margin,'% ')
## [1] "The cost of good sold on this batch is 10.75  for a gross profit of 69.25  and a profit margin on this recipe of: 644.19 % "

Not bad, our recipe yields a healthy profit margin of profit_margin. So as we sell product we can now closely track the COGS for use with our rule-based account entry function.

recipe_list_df <- recipe_df %>% 
  left_join(recipeparts_df,by='recipe_id') %>% 
  left_join(ingred_df,by='ing_id') %>% 
  left_join(unit_df,by='unit_id') %>% 
  left_join(ingred_cost_df,by='ing_id') %>% 
  mutate(cost_ing = cost_amount * recipe_amount / unit_size, cost_ing_per = (cost_amount * (recipe_amount / unit_size))/batch_size,) %>% 
  dplyr::select(recipe_id,prod_code,recipe_name,recipe_amount,unit,ing_code,ingredient_name,cost_ing,batch_size,cost_ing_per,price_batch,price_each,batch_size) 
recipe_list_df
recipe_summary_df <- recipe_list_df %>% group_by(recipe_id,recipe_name,price_batch,price_each,batch_size) %>% summarize(recipe_cost=sum(cost_ing)) %>% ungroup() %>% mutate(each_cost=recipe_cost/batch_size)
recipe_summary_df

By using precise measurements of our materials, we can determine the correct price for each item we sell, based on our desired profit margin. IN this example, we can see that slight differences in the amount of materials used, and the cost of those materials, may impact several factors like the yield (quantity produced) of each batch, the price of each item and the price of the batch. This can also be helpful for planning for the amount of materials we need to order to produce a certain amount of end products.

Create a Customer Table

It is a good idea to store some basic information about your customers, if available. Personally identifiable information is subject to legal protection. Be sure to secure and protect this information if you collect and store it!

cust_number <- 1001
customer_df <- data.frame()

customer_df <- customer_df %>% bind_rows(data.frame(
  cust_id=cust_number,
  fname='Joe',
  lname='Smith',
  email='[email protected]',
  phone='17072914948',
  optin=1
  ))

Create an Invoice

Set some meta data for a new invoice. Depending on whether its a sale or purchase, it may have some fields provided or omitted.

invoice_number <- 100001

invoice_header_df <- data.frame(
  budget_code='A1',
  po_number='',
  cust_id=cust_number,
  invoice_number=invoice_number,
  payment_date=Sys.time(),
  payment_method='Cash')

Add Line Items to Invoice

An invoice should list each individual item purchased and the quantity of that item, its individual cost, and the extension cost of the entire line item.

invoice_df <- data.frame()
invoice_df <- invoice_df %>% bind_rows(data.frame(invoice_number=invoice_number,recipe_id=1,unit_quantity=6))
invoice_df <- invoice_df %>% bind_rows(data.frame(invoice_number=invoice_number,recipe_id=2,unit_quantity=8))
invoice_df <- invoice_df %>% bind_rows(data.frame(invoice_number=invoice_number,recipe_id=3,unit_quantity=10))
invoice_df

Calculate goods costs from line items on an invoice

You can figure out the costs for each line item using some filters and summation functions.

#figure out the cost by taking the per item cost and multiplying it by quantity
order_cost <- invoice_df %>% 
  left_join(recipe_summary_df,by='recipe_id') %>% 
  mutate(line_price = unit_quantity * price_each, line_cost = unit_quantity * each_cost) %>% 
  dplyr::select(unit_quantity,recipe_name,price_each,line_price, line_cost, each_cost, recipe_cost)
order_cost
#get the order total
order_total <- order_cost %>% summarize(order_total=sum(line_price)) %>% pull(order_total)

#get the cost total
order_cogs <- order_cost %>% summarize(order_cost=sum(line_cost)) %>% pull(order_cost)

#determine gross profit
order_profit <- order_total - order_cogs

#round to 2 decimal places
order_cogs <- round(order_cogs,2)
order_profit <- round(order_profit,2)

#Calculate profit margin
profit_margin <- round((order_profit / order_cogs) * 100,2)

#write to screen
paste('The cost of goods sold on this order is $',order_cogs,' for a gross profit of $',order_profit,' and a profit margin on this recipe of:',profit_margin,'% ',sep="")
## [1] "The cost of goods sold on this order is $13.41 for a gross profit of $84.09 and a profit margin on this recipe of:627.07% "

OK… this invoice has a few lines for order_total dollars and we can see exactly what amount of this is cost: order_cogs

#order lines to customer
order_invoice <- order_cost %>% dplyr::select(unit_quantity,recipe_name,price_each,line_price)
  order_invoice
get_next_id <- function(){
  next_id <- invoice_header_df %>% arrange(-invoice_number) %>% top_n(1) %>% pull(invoice_number)
  next_id <- next_id + 1
  next_id
}
get_next_id()
## [1] 100002

Now, recall our add_cash_sale function which need both an amount in cash collected, and a cost of goods sold. Now we can more easily find this for each invoice we create.

#pull the sale amount and cost off the invoice generator
sale_amount <- order_total
  cogs <- order_cogs
  operating_expense <- 0
  less_taxes <- 0
  
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
balance_sheet()
## [1] TRUE
accts_total
simple_plot <- ggplot() + 
  geom_col(data=accts_total, aes(x=category,y=curr_balance,fill=category)) + 
   ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)')
 
 simple_plot

Create a Workers (Employees) Table

It’s important to store some basic information about your workers.

worker_id_seed <- 1001
worker_df <- data.frame()

get_next_worker_id <- function(){
  next_worker_id <- worker_df %>% arrange(-worker_id) %>% slice(1) %>% pull(worker_id)
  next_worker_id <- next_worker_id + 1
  next_worker_id
}


worker_number <- worker_id_seed
worker_df <- worker_df %>% bind_rows(data.frame(
  worker_id=worker_number,
  fname='Laura',
  lname='Jones',
  email='[email protected]',
  phone='15056221212',
  worker_hourly_wage=18,
  worker_hourly_tax=3.60
  ))

worker_number <- get_next_worker_id()
worker_df <- worker_df %>% bind_rows(data.frame(
  worker_id=worker_number,
  fname='Jared',
  lname='Wilson',
  email='[email protected]',
  phone='15056221213',
  worker_hourly_wage=16,
  worker_hourly_tax=3.20
  ))

worker_number <- get_next_worker_id()
worker_df <- worker_df %>% bind_rows(data.frame(
  worker_id=worker_number,
  fname='Lawrence',
  lname='Davis',
  email='[email protected]',
  phone='15056221214',
  worker_hourly_wage=16.80,
  worker_hourly_tax=3.40
  ))

worker_df
simple_plot <- ggplot() + 
  geom_col(data=worker_df, aes(x=paste(fname,lname),y=worker_hourly_wage,fill=paste(fname,lname))) + 
   ggtitle('Worker Wages') + labs(x='Worker',y='Wage per Hour')
 
 simple_plot

Calculating employee time in cost of product

Now that we have determined the cost for the raw materials - what about the labor involved in making the product - this should also be considered part of the cost. Since different employees get paid different amounts - w can either take the average cost of each item, or try to track the batch by batch costs using first in first out (FIFO) or last in first out (LIFO).

#set the worker's hourly wage & taxes you have to pay
worker_hourly_wage <- 16
worker_hourly_tax <- 3.20

#set the time to make each batch
batch_time_hr <- .5

#set the time to make each batch
batch_labor_cost <- batch_time_hr * (worker_hourly_wage + worker_hourly_tax) 
batch_labor_cost
## [1] 9.6
per_item_labor <- batch_labor_cost / 20
per_item_labor
## [1] 0.48

Calculate accurate cost as batches are created

Each time we create a batch of products, we will enter this in a productivity tracker table. This will help us recall: A. What were the ingredient costs of that specific batch B. What were the labor costs of that specific batch

batch_id_seed <- 1001
batch_df <- data.frame()

get_next_batch_id <- function(){
  next_batch_id <- batch_df %>% arrange(-batch_id) %>% slice(1) %>% pull(batch_id)
  next_batch_id <- next_batch_id + 1
  next_batch_id
}


add_batch <- function(
  batch_id=NULL,
  batch_date=Sys.time(),
  recipe_id=NULL,
  worker_id=NULL,
  batch_time_hr=NULL){
 
  batch_df <<- batch_df %>% bind_rows(data.frame(
  batch_id=batch_id,
  batch_date=batch_date,
  recipe_id=recipe_id,
  worker_id=worker_id,
  batch_time_hr=batch_time_hr
  ))
  
}

batch_number <- batch_id_seed
batch_df <- batch_df %>% bind_rows()

batch_number <- batch_id_seed
add_batch(
  batch_id=batch_number,
  batch_date=Sys.time(),
  recipe_id=1,
  worker_id=1001,
  batch_time_hr=.65
  )

batch_number <- get_next_batch_id()
add_batch(
  batch_id=batch_number,
  batch_date=Sys.time(),
  recipe_id=2,
  worker_id=1002,
  batch_time_hr=.75
  )

batch_number <- get_next_batch_id()
add_batch(
  batch_id=batch_number,
  batch_date=Sys.time(),
  recipe_id=3,
  worker_id=1003,
  batch_time_hr=.5
  )

batch_df

Having created a few batches, we can start to analyze the cost of each batch in terms of both materials and labor.

batch_cost_df <- batch_df %>% 
  left_join(recipe_summary_df, by="recipe_id") %>% 
  left_join(worker_df, by="worker_id") %>%
  mutate(batch_wage_cost = batch_time_hr * worker_hourly_wage,
         batch_tax_cost = batch_time_hr * worker_hourly_tax,
         each_wage_cost = (batch_time_hr * worker_hourly_wage)/batch_size,
         each_tax_cost = (batch_time_hr * worker_hourly_tax)/batch_size
         
         )

Now we have added some labor costs into the cost of goods sold.

simple_plot <- ggplot() + 
  geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"batch price"),y=price_batch,label=recipe_id,fill='Price')) + 
  geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"batch qty"),y=batch_size,label=recipe_id,fill='Qty')) + 
  ggtitle('Batch Costs by Recipe') + labs(x='Batch Price ($) and Units Per Batch (#)',y='Total')
 
 simple_plot

simple_plot <- ggplot() + 
  geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"each(matl)"),y=each_cost,label=recipe_id,fill='Materials')) + 
  geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"each(labor)"),y=each_wage_cost,label=recipe_id,fill='Labor')) + 
   ggtitle('Unit Costs by Recipe') + labs(x='Material Cost ($) and Labor Cost ($) per Unit',y='Total')
 
 simple_plot

R has convenient functions for working with excel (xlsx) files. These files make a great way to create an application out of pure R without a database of any kind. The excel files become your application database, enabling your data to be permanently stored and accessed outside the R environment.

# TO read data from an Excel file or Workbook object into a data.frame:
#df <- read.xlsx('name-of-your-excel-file.xlsx')

#To write a data.frame or list of data.frames to an xlsx file:
#?write.xlsx

acct_worksheets <- list(
          "General Ledger" = ledger_df,
          "Chart of Accounts" = accounts_df, 
          "Balance Sheet" = accts_total, 
          "Cashflow Statement" = cashflow_df, 
          "Income Statement" = income_df 
          )
write.xlsx(acct_worksheets, file = "Accounting.xlsx", colNames = TRUE, borders = "columns")

recipe_worksheets <- list(
          "Recipes" = recipe_df,
          "Ingredients" = ingred_df, 
          "Recipe Parts" = recipeparts_df, 
          "Recipe Costs" = recipe_cost,
          "Units of Measure" = unit_df
          )
write.xlsx(recipe_worksheets, file = "Recipes.xlsx", colNames = TRUE, borders = "columns")

prod_worksheets <- list(
          "Workers" = worker_df,
          "Batches" = batch_cost_df
          )
write.xlsx(prod_worksheets, file = "Production.xlsx", colNames = TRUE, borders = "columns")

invoice_worksheets <- list(
          "Invoices" = invoice_header_df,
          "Invoice Items" = invoice_df,
          "Customers" = customer_df 
          )
write.xlsx(invoice_worksheets, file = "Invoices.xlsx", colNames = TRUE, borders = "columns")

Having saved the files that store all this key data, you may now use Excel to see your ledger, balance sheet and other key reports at any time.

Files Generated by this Document

Since this document is focused on generating excel files, we have included links to the excel files generated below. These files act as the permanent database for your accounting project.

Accounts List link

Invoices List link

Production List link

Recipes List link

Further Study / Next Steps

If you were to try to use R for long term accounting - you would need more interactivity in your accounting system. You need ways to, for example, add customers, create invoices, and process sales without writing a ton of code every time. You need reusable interfaces.

Check back for a future installment of this e-book to see how these basoic accounting concepts could be transferred to a more practical and re-usable system of double-entry bookkeeping.


This concludes “An Accounting Companion in R Markdown” - Thanks for reading, and please check out our web site for more installments:

In “A Finite Math Companion in R Markdown”, we explore more fully the subjects of finite mathematics including sets, financial formulas, interest and annuities, matrix operations, and more.

In “A Statistics Companion in R Markdown”, we will explore more fully the subjects of probability, statistics, and additional applications of expected value, including subjects of linear modeling and inference. We will look at how to use R to generate histograms, ogives, box plots, and other statistical charts.

In “A Linear Algebra Companion in R Markdown” we will more thoroughly explore the subject of vector spaces, and how these to visualize vectors geometrically using the principles of Linear Algebra. This guide will explore more about the inner product, number spaces, and the concepts of magnitude and orthogonality.

Appendix: Installing R

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

To get the latest version of R, please visit:

https://www.r-project.org/

Working with R is much easier and faster when you use the RStudio IDE. This is the environment used to create this R Markdown file.

https://www.rstudio.com/

Packages used in this document are available from the Comprehensive R Archive Network (CRAN):

http://cran.us.r-project.org

For more info on R packages visit:

https://cran.r-project.org/web/packages/

Notes on Publishing RMarkdown

This document was generated by RMarkdown using R, R Studio, R Markdown, and LaTeX, and rendered to HTM via the Knit command and KnitR utility.

Additional formatting has been applied to results variables using the kable and kableExtra packages. These code blocks are not shown in the output, for readability.