Now that you have imported your data into ACL Analytics, what’s next? Don’t start crunching numbers and analyzing your data just yet! Before performing any analysis, you will need to prepare your data. Preparing data is often an overlooked, yet crucial step in the data analysis process. You can plan and perfectly execute the most intricate, high-stakes data analysis—but your findings won’t be valuable if your data wasn’t accurate to begin with. In ACL, you can run a series of commands and functions to make sure your data is clean, accurate and formatted correctly.
In this blog post, we will take a closer look at five ACL functions that can help you prepare and cleanse your data.
Before we begin… what is a function?
A function is a specific task or calculation that can prepare, amend, and manipulate your data. For example, you can use the DATE() function to format date entries in your data to DD/MM/YYYY or MM/DD/YYYY.
Here are five commonly used ACL functions and their syntax that can help with data cleansing:
1. ISBLANK() – returns a logical value indicating whether the input value is blank.
2. MAP() – returns a logical value indicating if a character string matches a specified format string containing wildcard characters, literal characters, or both.
3. BETWEEN() – returns a logical value indicating whether the specified value falls within a range.
BETWEEN(value, min, max)
4. REPLACE() – replaces all instances of a specified character string with a new character string.
REPLACE(string, old_text, new_text)
5. ALLTRIM() – returns a string with leading and trailing spaces removed from the input string.
So how do we use these functions to cleanse our data?
Let’s take a look at an example:
You have just finished importing your vendor master file into ACL Analytics and you need to cleanse your data set.
You want to:
- Remove records with blank fields with ISBLANK()
- Ensure that all of the Tax ID numbers are in the correct format using MAP()
- Filter the list down to only 2018 dates with BETWEEN()
- Regulate all of the address fields with REPLACE()
- Find and remove any unnecessary spaces using ALLTRIM()
FUNCTION 1 – ISBLANK()
STEP 1: Click the Edit View Filter button to the the right of the Filter and Quick Search bar to bring up the Expression Builder.
STEP 2: Select your function, ISBLANK(), from the functions list on the bottom right. Type NOT in front of the function to filter out the records with blank entries and retain only the records with data in them.
STEP 3: Replace string with the name of the field that you need to remove blanks from. In this example, we replace string with the field Vendor_State.
STEP 4: Click the Verify button to check if your expression is valid. Then click OK to perform the function on your data.
FUNCTION 2 – MAP()
Next up, let’s check all of the vendor’s tax IDs to make sure they are in the same format (e.g. 999-99-9999).
IMPORTANT NOTE: Always build onto your existing functions by adding the AND operator to the end of your previous expression. This tells Analytics to act out both functions. You can also use the OR operator to tell Analytics to act out either function, depending on your needs.
STEP 1: From the Expression Builder, select the MAP() function.
STEP 2: Replace string with Tax_ID field, and replace format with “999-99-9999”.
STEP 3: Click the Verify button to check if your expression is valid. Then click OK to perform both functions on your data.
FUNCTION 3 – BETWEEN()
Now let’s filter the list down to review dates within 2018 only. Again, remember to add the AND operator after the previous expression.
STEP 1: From the Expression Builder, select BETWEEN().
STEP 2: Replace value with the Vendor_Next_Review_Date field, replace min with `20180101`, replace max with `20181231`.
STEP 3: Click the Verify button to check if your expression is valid. Then click OK to perform the functions on your data.
FUNCTION 4 – REPLACE()
Now you’ve noticed that some of the street names in the Vendor_Street field are inconsistent; for example, Ave. should be Avenue. You can fix this with the REPLACE() function.
This time instead of filtering you need to create a computed field. (Click here for a refresher on what a computed field is)
STEP 1: Open Table Layout (Ctrl + I) and select the Add a New Expression button on the left.
STEP 2: Name the computed field c_Vendor_Street. Now click on the f(x) button and the Expression Builder will pop up. Similar to the above steps, select the REPLACE() function in the list.
STEP 3: Change string to Vendor_Street field, change old_text to “Ave.”, and change new_text to “Avenue”.
- It should look like this: REPLACE (Vendor_Street, “Ave.”, “Avenue”)
STEP 4: Click the Verify button to check if your expression is valid. Then click OK to perform the functions on your data.
STEP 5: You will notice that nothing changes in the data because you have created a new field, rather than filtering out your existing data. In order to display the new computed field, right click anywhere in the data and select Add Columns. Select your new field and click the right arrow to add it to your data view. You can also hide the old field by selecting the column, right click and select Remove Selected Columns.
FUNCTION 5 – ALLTRIM()
Finally, let’s remove any unnecessary leading or trailing spaces from the Vendor_City field.
STEP 1: Create a computed field, name it, and select ALLTRIM() from the Expression Builder.
STEP 2: Replace string with Vendor_City.
STEP 3: Click the Verify button to check if your expression is valid. Then click OK to perform the function on your data.
STEP 4: Right click to add in your new column and again to hide the old column.
Great job, you are well on your way to clean data for you to analyze. The five functions we have explained here are only a small selection of the functions you can perform. To see what else you can do, take a look at our Help Doc page “Top 30 ACL functions.”
Frequently Asked Questions
Here are some Frequently Asked Questions for using ACL functions:
1. How can I prepare my data without altering my original data set?
Performing any actions in ACL Analytics, including running a function, will never alter your source data.
2. I just finished performing a function but when I try to run another function, my first function was gone. How can I run concurrent functions in the same data set?
Simply add the AND operator between each function. For example: NOT ISBLANK (Vendor_State) AND MAP (Tax_ID, “999-99-9999”). You can also use the OR operator to perform one function or another, depending on what you want to achieve.
3. I want to learn more ACL functions. Where can I advance my knowledge?
The easiest way to learn what any function does is to play around with it in the ACL command line. You can try by following the examples in the “Familiarizing with different functions” page in our Help Docs.
Tip of the Month
Looking for more of a challenge? You can run multiple functions at once by nesting one function inside another. This can achieve results that you couldn’t achieve with either function alone. For a more detailed explanation, visit the “Cleaning and filtering data at the same time” Help Docs page.
New “Basics to ACL GRC Analytics Exchange” course available in ACL Academy!
You asked, we answered. The new Basics to ACL GRC Analytics Exchange course offers bite-sized video tutorials to get you up to speed with the essentials of Analytics Exchange. Learn to manage users, create data containers, convert ACL Analytics scripts for Analytics Exchange, import projects from ACL Analytics into Analytics Exchange, and run an Analytic in Analytics Exchange.
Interested to learn about other useful ACL functions in person? We are offering two “Top 20 ACL Analytics Functions” training workshops at ACL Connections 2018, our largest training event of the year. The best part is that you can still take advantage of the early-bird pricing if you register before June 30th!