This section, intended for the more technically minded readers, will deal with real data analysis problems and their solutions – mostly in R.
I am not maintaining a discussion section for the moment. Please email me with any suggestions or questions you have.
Let’s start with a few observations about linear models. The usual discussion posits a target variable $y$ of length $N$ and a design matrix $X$ with $p$ columns (of full rank, say) and $N$ rows. The linear model equation can be written $y = X\beta + \epsilon$, and the least squares solution is obtained by projecting $y$ orthogonally onto the image of $X$, giving the almost iconic formula in the title, $\hat\beta = (X^T X)^{-1} X^T y$, for the estimated parameters.
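As a small illustration of the projection argument, the following sketch computes the least squares estimate from the normal equations and compares it with `lm`. The simulated data, the coefficient values and all variable names are assumptions made for this example only.

```r
set.seed(1)
N <- 100
p <- 3

# Simulated design matrix: an intercept column plus two random regressors
X <- cbind(1, matrix(rnorm(N * (p - 1)), nrow = N))
beta <- c(2, -1, 0.5)                      # assumed "true" parameters
y <- drop(X %*% beta + rnorm(N, sd = 0.1))

# Solve the normal equations (X'X) b = X'y instead of inverting X'X explicitly
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# Reference fit; "- 1" because X already contains the intercept column
fit <- lm(y ~ X - 1)

# Fitted values are the orthogonal projection of y onto the image of X,
# so the residuals are orthogonal to every column of X
y_hat <- drop(X %*% beta_hat)
crossprod(X, y - y_hat)                    # numerically zero
```

Using `solve(A, b)` rather than `solve(A) %*% b` avoids forming the inverse; `lm` itself relies on a QR decomposition, which behaves better when the columns of $X$ are nearly collinear.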
The data set on German fuel prices contains the fuel prices, but not the sales, of more than 14,000 fuel stations in Germany since June 2014. It is made available by the web service Tankerkoenig as a Postgres dump under CC4.0.
purrr
Today, I want to present a simple way to use purrr to create a static website generator. Of course, there are Jekyll, Hugo and the blogdown package, but in many cases you may find yourself, like I did, with your own bits and pieces of html, no time to learn yet another language, and in need of a way to put all this into a structured and consistent static website.
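To give a flavour of the idea, here is a minimal sketch of how purrr could drive such a generator. It is not the code from the post: the page list, the `render_page` layout function and the output directory are all hypothetical and only show the pattern of mapping one template over many fragments.

```r
library(purrr)

# Hypothetical page fragments; in practice these might be read from html files
pages <- list(
  index = list(title = "Home",  body = "<p>Welcome.</p>"),
  about = list(title = "About", body = "<p>About this site.</p>")
)

# Hypothetical layout shared by all pages
render_page <- function(page, nav) {
  paste0(
    "<!DOCTYPE html>\n<html>\n<head><title>", page$title, "</title></head>\n",
    "<body>\n<nav>", nav, "</nav>\n", page$body, "\n</body>\n</html>"
  )
}

# Build the navigation bar once from the page names and titles
nav <- paste(
  imap_chr(pages, function(page, name) {
    sprintf('<a href="%s.html">%s</a>', name, page$title)
  }),
  collapse = " | "
)

# Render every page to a string, then write one file per page
html <- map_chr(pages, function(page) render_page(page, nav))
out_dir <- file.path(tempdir(), "site")
dir.create(out_dir, showWarnings = FALSE)
iwalk(html, function(content, name) {
  writeLines(content, file.path(out_dir, paste0(name, ".html")))
})
```

Keeping the rendering step (`map_chr`) separate from the writing step (`iwalk`) makes it easy to inspect the generated html before anything touches the disk.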
rep function
To get this R blog series started, I want to present a little data preparation procedure which comes up often in the analysis of time series.
We are used to thinking of time series as somehow being regularly spaced. For instance, we may be looking at preaggregated data, like daily or weekly total sales, or measurements being made at predefined points in time, like closing prices of a stock or the number of companies in business at the end of a given month.
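One situation where `rep` helps with this kind of preparation, offered here only as an assumed example and not necessarily the procedure of the post, is turning irregularly spaced change events into a regular daily series. The data frame, the dates and the prices below are made up for illustration.

```r
# Hypothetical event data: a price and the date from which it applies
changes <- data.frame(
  date  = as.Date(c("2014-06-01", "2014-06-04", "2014-06-05")),
  price = c(1.55, 1.59, 1.57)
)
end_date <- as.Date("2014-06-08")  # assumed last day of the window (inclusive)

# Number of days for which each price stays valid
n_days <- as.integer(diff(c(changes$date, end_date + 1)))

# Repeat each price once per day to obtain a regularly spaced series
daily <- data.frame(
  date  = seq(min(changes$date), end_date, by = "day"),
  price = rep(changes$price, times = n_days)
)
```

The `times` argument of `rep` accepts a vector, so each value is repeated for exactly as many days as it was in force.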
Boris Vaillant - Quantitative Consulting 17