# Recent Ingenuity

## Histogram in R

After a lot of finessing, here is R code for a really great histogram:

    library(ggplot2)
    library(formattable)
    library(scales)

    # http://t-redactyl.io/blog/2016/02/creating-plots-in-r-using-ggplot2-part-7-histograms.html
    windowsFonts(Tahoma = windowsFont("Tahoma"))

    lengthselect <- flist_widget[flist_widget$length == 10, ]
    lengthselect
    summary(lengthselect)

    barfill <- "cyan3"
    barlines <- "#1F3552"

    meanprice <- mean(lengthselect$price)
    medianprice <- currency(median(lengthselect$price), digits = 0L)
    sdprice <- currency(sd(lengthselect$price), digits = 0L)
    rangeprice <- currency(range(lengthselect$price), digits = 0L)
    minprice <- currency(min(lengthselect$price), digits = 0L)
    maxprice

## R Coding

A few snippets of code that are always useful.

Finding which columns have NA values:

    unlist(lapply(dataframe, function(x) any(is.na(x))))
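To see what that one-liner returns, here is a minimal sketch on a toy data frame. The data frame contents and the helper names `has_na`, `na_columns`, and `na_counts` are made up for illustration and are not from the original post.

```r
# Hypothetical toy data frame; column names and values are illustrative only.
dataframe <- data.frame(
  id    = 1:4,
  price = c(10.5, NA, 12.0, 9.75),
  size  = c("S", "M", NA, "L"),
  stringsAsFactors = FALSE
)

# One TRUE/FALSE flag per column: does it contain at least one NA?
has_na <- unlist(lapply(dataframe, function(x) any(is.na(x))))

# Names of the columns that contain NA values
na_columns <- names(has_na)[has_na]

# Number of NA values in each column, for a sense of scale
na_counts <- colSums(is.na(dataframe))

print(has_na)
print(na_columns)
print(na_counts)
```

The result of the one-liner is a named logical vector, so it can be used directly to subset column names, as `na_columns` does here.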

## Machine Learning: Charity Donor Analysis

### Introduction

A charitable organization wishes to develop a machine learning model to improve the cost-effectiveness of its direct marketing campaigns to previous donors. Recent mailing records show an overall 10% response rate, with an average donation of \$14.50. Each mailing costs \$2 to produce and send.
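The figures above already hint at why a model is wanted. A quick back-of-the-envelope check, assuming the \$14.50 average applies only to donors who respond (the excerpt does not say so explicitly), looks like this. The variable names are illustrative.

```r
# Figures quoted in the introduction
response_rate <- 0.10   # share of mailed donors who respond
avg_donation  <- 14.50  # assumed to be the average gift among responders
cost_per_mail <- 2.00   # cost to produce and send one mailing

# Expected revenue and profit per mailing if everyone is mailed
expected_revenue <- response_rate * avg_donation
expected_profit  <- expected_revenue - cost_per_mail

# Response rate at which a mailing pays for itself
break_even_rate <- cost_per_mail / avg_donation

print(expected_revenue)
print(expected_profit)
print(break_even_rate)
```

Under that assumption, mailing every previous donor brings in about \$1.45 per piece against a \$2 cost, so a blanket campaign loses roughly \$0.55 per mailing. A model only needs to pick out recipients whose response probability is above about 13.8% for the campaign to break even.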

## Text Analytics in R – Internet of Things (IoT)

A small corpus of ten articles related to the Internet of Things (IoT) was collected for the purpose of text analytics. Using R, each article was cleaned of unusual characters, converted to lower case, and stripped of numbers, punctuation, stop words, and white space, along with any additional terms that
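The cleaning steps listed above can be sketched in base R. This is not the post's own code, which may well use a text-mining package. The function `clean_text`, the sample sentences, and the very short stop word list are all assumptions made for illustration.

```r
# Hypothetical helper: applies the cleaning steps in the order listed above.
clean_text <- function(x, stop_words) {
  # Replace characters that cannot be represented in ASCII with spaces
  x <- iconv(x, from = "UTF-8", to = "ASCII", sub = " ")
  # Convert to lower case
  x <- tolower(x)
  # Remove numbers, then punctuation
  x <- gsub("[[:digit:]]+", " ", x)
  x <- gsub("[[:punct:]]+", " ", x)
  # Remove stop words, matching whole words only
  pattern <- paste0("\\b(", paste(stop_words, collapse = "|"), ")\\b")
  x <- gsub(pattern, " ", x, perl = TRUE)
  # Collapse repeated white space and trim the ends
  x <- gsub("\\s+", " ", x, perl = TRUE)
  trimws(x)
}

# Illustrative stop word list; a real analysis would use a much fuller one.
stop_words <- c("the", "of", "by", "a", "and", "is")

# Made-up sample text standing in for the articles
articles <- c(
  "The Internet of Things \u2013 50 billion devices by 2020!",
  "Sensors & gateways: 3 layers of   security."
)

cleaned <- clean_text(articles, stop_words)
print(cleaned)
```

The order matters a little: lower-casing before stop word removal lets a lower-case stop word list match "The" as well as "the".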

## Factor Analysis to Identify Sectors

### Introduction

Using a stock portfolio data set and a factor analysis to identify sectors in the stock market, we will transform the variables into log values to explain the variation in the log-returns of the stocks and the market index. We will begin the factor analysis by performing a Principal Factor Analysis without a
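The log transformation mentioned above can be sketched as follows. The tickers and prices are invented for illustration, and the sketch assumes the data set holds price levels rather than returns, which the excerpt does not state.

```r
# Hypothetical closing prices; tickers and values are made up.
prices <- data.frame(
  AAA   = c(100, 102, 101, 105),
  BBB   = c(50, 49, 51, 52),
  INDEX = c(1000, 1010, 1005, 1020)
)

# Log-return for period t: log(P_t / P_{t-1}) = log(P_t) - log(P_{t-1})
log_returns <- as.data.frame(lapply(prices, function(p) diff(log(p))))

# Log-returns add across periods, so each column sum equals
# the log of total growth over the whole window.
total_log_growth <- colSums(log_returns)

print(round(log_returns, 4))
print(round(total_log_growth, 4))
```

One observation is lost in the differencing, so four prices yield three log-returns per series. These columns would then be the input to the factor analysis.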

## Principal Components Analysis

Using a stock portfolio data set, we apply Principal Components Analysis as a method for reducing dimension and as a remedial measure for multicollinearity in Ordinary Least Squares regression. Beginning with the data, we will transform the variables into log values to explain the variation in the log-returns of the stocks and
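A minimal sketch of the idea, using simulated returns instead of the post's portfolio data: four hypothetical stocks share a common market driver, so they are strongly collinear, and `prcomp` replaces them with uncorrelated components. All names and parameters here are assumptions.

```r
set.seed(1)
n <- 250

# Shared driver that makes the simulated stock returns collinear
market <- rnorm(n, sd = 0.01)

# Hypothetical log-returns: common driver plus stock-specific noise
returns <- data.frame(
  AAA = market + rnorm(n, sd = 0.003),
  BBB = market + rnorm(n, sd = 0.003),
  CCC = market + rnorm(n, sd = 0.003),
  DDD = market + rnorm(n, sd = 0.003)
)

# PCA on centered, standardized returns
pca <- prcomp(returns, center = TRUE, scale. = TRUE)

# Share of total variance captured by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Component scores are mutually uncorrelated, unlike the original columns
score_cor <- cor(pca$x)
max_offdiag <- max(abs(score_cor[upper.tri(score_cor)]))

print(round(var_explained, 3))
print(max_offdiag)
```

Because the scores are uncorrelated, regressing a response on the first few components sidesteps the multicollinearity that the original stock columns would bring into an OLS fit, at the cost of coefficients that are harder to interpret.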

## Automated Variable Selection

The Ames, Iowa housing data set has been used to develop various iterative regression models that estimate the mean sale price of a house from numerous variables. The variables range from correlated continuous variables to categorical variables. In this installment, we continue building the model using raw categories and later, the
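The excerpt does not say which selection procedure the post uses. As one common option, here is a sketch of forward selection by AIC with `step()` from base R, run on simulated data. The column names, coefficients, and the `houses` data frame are invented and do not come from the Ames data.

```r
set.seed(123)
n <- 200

# Simulated stand-in for housing data; every column here is hypothetical.
houses <- data.frame(
  living_area = rnorm(n, mean = 1500, sd = 400),
  quality     = sample(1:10, n, replace = TRUE),
  zone        = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  noise       = rnorm(n)  # unrelated to price by construction
)
houses$sale_price <- 20000 +
  80 * houses$living_area +
  12000 * houses$quality +
  rnorm(n, sd = 15000)

# Start from the intercept-only model and let step() add terms
null_model <- lm(sale_price ~ 1, data = houses)

selected <- step(
  null_model,
  scope = list(lower = ~ 1, upper = ~ living_area + quality + zone + noise),
  direction = "forward",
  trace = 0  # suppress the step-by-step log
)

selected_terms <- attr(terms(selected), "term.labels")
print(formula(selected))
```

Forward selection adds the term that lowers AIC the most at each step and stops when no addition helps. It handles the categorical `zone` as a single block of dummy variables, which is worth keeping in mind when mixing continuous and categorical predictors.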

## Data Variables and Analytical Models

Before diving into a statistical analysis of any dataset, it is essential to spend the requisite time understanding the data, checking its quality, and taking a look ‘under the dash’. Below, we will examine the data variables and analytical models on housing prices as a first step in predicting