Data Hack Tuesday

Encoding Text

October 27, 2020

Here’s a great way to encode 1GB of text in under 20 sec using HuggingFace Tokenization shorturl.at/eixZ4


Generic codebase

October 20, 2020

Improve the speed of your data hacks by creating a reproducible generic codebase. shorturl.at/gxIR6


Columns

October 13, 2020

Apply separate transformations on every column using Sklearn’s ColumnTransformer https://scikit-learn.org/stable/modules/generated/sklearn.compose.ColumnTransformer.html

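As a minimal sketch of the idea, the snippet below scales two numeric columns and one-hot encodes a categorical column in a single ColumnTransformer. The toy DataFrame and its column names (age, income, city) are hypothetical and chosen only for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numeric columns and one categorical column.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "income": [40_000, 52_000, 88_000, 61_000],
    "city": ["Oslo", "Lima", "Oslo", "Pune"],
})

# Each entry is (name, transformer, columns the transformer applies to).
preprocess = ColumnTransformer(
    transformers=[
        ("scale", StandardScaler(), ["age", "income"]),
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ]
)

# 2 scaled columns + 3 one-hot columns (one per distinct city).
features = preprocess.fit_transform(df)
print(features.shape)
```

Columns that are not listed are dropped by default; passing remainder="passthrough" keeps them unchanged.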

Intro to linear regression

October 6, 2020


Practice linear regression by tapping into Python’s indispensable sklearn library https://scikit-learn.org/stable/index.html and its make_regression() function, which generates a synthetic regression data set. Because you choose the number of samples, the number of features, and the amount of noise, you completely control your data’s behavior, whether you want a mini random data set or need to debug your algorithm.
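Here is a small sketch of that workflow: generate a synthetic data set with known coefficients, fit a LinearRegression model, and compare the fitted coefficients with the true ones. The sample count, feature count, noise level, and seed are arbitrary values picked for the example.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# coef=True also returns the coefficients used to generate y,
# which makes it easy to check whether a model recovers them.
X, y, true_coef = make_regression(
    n_samples=200,
    n_features=3,
    n_informative=3,
    noise=5.0,
    coef=True,
    random_state=0,
)

model = LinearRegression().fit(X, y)

print("true coefficients:  ", true_coef.round(2))
print("fitted coefficients:", model.coef_.round(2))
```

Setting noise=0.0 produces a perfectly linear target, which is handy when you are debugging your own regression code.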

Descriptive, prescriptive, or predictive?

September 29, 2020


What type of analytics makes the most sense for your organization? Naturally, this depends on the project and on your goals. The three main types of analytics that you will want to consider are:

  • Descriptive
  • Predictive
  • Prescriptive

Here’s a handy reminder of how to apply each of these.

Prescriptive analytics helps you answer the question of what your organization should do. Compare this to descriptive analytics, which is the method for understanding your current situation or problem and reviewing how it looked in the past. Unlike prescriptive analytics, descriptive analytics does not answer a specific question; instead, it presents a clear overview of the problem. Both differ considerably from predictive analytics, which involves making predictions about future behaviors or results based on current data and trends.

Data Security

September 22, 2020


Thanks to the sudden increase in remote work, companies must now confront a new set of security challenges. One small way to help protect company data is to avoid downloading data to spreadsheets and Excel files.

While this is common enough, a few problems can result from downloading data to a spreadsheet:

  • It is no longer possible to control how the data is used or with whom it is shared.
  • The files could be exploited.
  • The risk of confidential information becoming exposed is greatly increased.

DSUM of all things

September 15, 2020


Today’s Data Hack takes you back to basics to make the most out of Excel. You don’t need to be a scientist in order to sort and analyze data in a basic way. Check out this short overview of database functions that will help you parse information from lists:

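DCOUNT – Counts the cells containing numbers in a field, for records matching the criteria

DMAX – Finds the largest value in a field, for records matching the criteria

DMIN – Finds the smallest value in a field, for records matching the criteria

DSUM – Calculates the sum of the values in a field, for records matching the criteria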

These functions all use the same three-argument syntax. Here is an example using DAVERAGE, a related database function that averages the matching values:

=DAVERAGE(database,field,criteria)
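To make the database/field/criteria pattern concrete, here is a rough Python emulation. It is only an illustration of the idea, not how Excel implements these functions: the database is assumed to be a list of dicts, the criteria are simplified to exact-match column values, and the orders data and helper names are hypothetical.

```python
def matching_values(database, field, criteria):
    """Collect `field` from every row that satisfies all criteria."""
    return [
        row[field]
        for row in database
        if all(row[column] == wanted for column, wanted in criteria.items())
    ]

def dsum(database, field, criteria):
    return sum(matching_values(database, field, criteria))

def dmax(database, field, criteria):
    return max(matching_values(database, field, criteria))

def dmin(database, field, criteria):
    return min(matching_values(database, field, criteria))

def dcount(database, field, criteria):
    return len(matching_values(database, field, criteria))

# Hypothetical list: each dict is one row, keys are the column headers.
orders = [
    {"Region": "East", "Product": "Apples", "Sales": 120},
    {"Region": "West", "Product": "Apples", "Sales": 80},
    {"Region": "East", "Product": "Pears", "Sales": 45},
    {"Region": "East", "Product": "Apples", "Sales": 60},
]

print(dsum(orders, "Sales", {"Region": "East"}))                       # 225
print(dmax(orders, "Sales", {"Region": "East"}))                       # 120
print(dcount(orders, "Sales", {"Region": "East", "Product": "Apples"}))  # 2
```

In Excel itself, the criteria argument is a range holding a header row and one or more condition rows, so it can express comparisons such as >100 that this sketch leaves out.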

Method or Function?

September 8, 2020


Today’s Data Hack is a helpful reminder of the difference between methods and functions in Python and when to use them. Let’s begin with methods:

A method is called by its name, but it is always associated with an object (it is dependent on that object). It may or may not return data. A method can also operate on the data contained in its class or instance. Here is an example:

Basic Python

class class_name:
    def method_name(self):
        # method body
        ...

Method class

class Meth:
    def method_meth(self):
        print("This is a method_meth of Meth class.")

class_ref = Meth()  # object of Meth class
class_ref.method_meth()

This differs from functions, which are blocks of code called by their name alone (they are independent). A function can take any number of parameters, or none, and any data it needs is passed explicitly as arguments. A function is not tied to a class or its objects. Here is an example:

def function_name(arg1, arg2, …):
    # function body

In other words, methods and functions look similar, but the central difference lies in the class and its object. A function is defined independently and is called by its name alone. A method is defined inside a class and is called through an object of that class.
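The contrast can be seen side by side in the short sketch below. The Temperature class and both conversion routines are hypothetical names made up for this illustration.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def to_fahrenheit(self):
        # Method: reads the data stored on the object via self.
        return self.celsius * 9 / 5 + 32


def celsius_to_fahrenheit(celsius):
    # Function: everything it needs arrives as an explicit argument.
    return celsius * 9 / 5 + 32


reading = Temperature(100)
print(reading.to_fahrenheit())      # 212.0, called through the object
print(celsius_to_fahrenheit(100))   # 212.0, called by name alone
```

Calling reading.to_fahrenheit() is equivalent to Temperature.to_fahrenheit(reading): Python passes the object as the first argument, self.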

Style with Conditional Formatting

September 1, 2020


If you are looking to adjust the visual styling of a DataFrame, then you should definitely try applying conditional formatting. This can be done using the DataFrame.style property, which returns a Styler object that is useful for formatting and displaying DataFrames. The styling itself is expressed in CSS: you write “style functions” that take scalars, Series, or DataFrames and return like-indexed Series or DataFrames whose values are CSS “attribute: value” strings. We recommend looking more deeply into Styler to learn more about applying styling functions to a DataFrame in different ways.
