Tag Archives: Cross Validated

What is the output of a tf.nn.dynamic_rnn()?

I am not sure I understand the official documentation, which says: Returns: A pair (outputs, state) where: outputs: The RNN output Tensor. If time_major == False (default), this will be a Tensor shaped: [batch_size, max_time, cell.output_size]. If time_major == True, this will be a Tensor shaped: [max_time, batch_size, cell.output_size]. Note, if cell.output_size… Read More »
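The shapes in the documentation can be made concrete without TensorFlow itself. Below is a minimal NumPy sketch of a vanilla RNN unrolled over time, assuming illustrative sizes (`batch_size`, `max_time`, etc. are not from the question): `outputs` stacks the hidden state at every step, while `state` is only the final hidden state, mirroring what `tf.nn.dynamic_rnn` returns with `time_major == False`.

```python
import numpy as np

# Illustrative sizes, not from the question.
batch_size, max_time, input_size, hidden_size = 4, 7, 3, 5

rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, max_time, input_size))  # time_major == False
Wx = rng.standard_normal((input_size, hidden_size))
Wh = rng.standard_normal((hidden_size, hidden_size))

h = np.zeros((batch_size, hidden_size))
outputs = []
for t in range(max_time):
    h = np.tanh(x[:, t, :] @ Wx + h @ Wh)  # one vanilla-RNN step
    outputs.append(h)

outputs = np.stack(outputs, axis=1)  # [batch_size, max_time, hidden_size]
state = h                            # final hidden state: [batch_size, hidden_size]

print(outputs.shape)                          # (4, 7, 5)
print(np.allclose(state, outputs[:, -1, :]))  # True for a basic cell
```

For an LSTM cell the returned `state` is a pair (cell state, hidden state) rather than a single tensor, which is the usual source of confusion.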

Regression Calibration in R

Stata has a function called rcal (see The Regression Calibration Method for Fitting Generalized Linear Models with Additive Measurement Error) that can be used to perform a regression when there is measurement error in the covariates using the regression calibration approach. Can anyone recommend a package that gives an equivalent function in R?
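While the question asks for an R package, the regression-calibration idea itself is short enough to sketch. The following Python snippet (not the Stata `rcal` command; all parameter values are illustrative, and the measurement-error variance is assumed known, e.g. estimated from replicate measurements) replaces the error-prone covariate W by an estimate of E[X | W] before fitting the outcome model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(0.0, 1.0, n)            # true (unobserved) covariate
w = x + rng.normal(0.0, 0.5, n)        # observed with additive measurement error
y = 2.0 * x + rng.normal(0.0, 0.2, n)  # outcome depends on the true covariate

sigma_u2 = 0.25                        # measurement-error variance, assumed known
lam = (np.var(w) - sigma_u2) / np.var(w)     # reliability ratio
x_cal = np.mean(w) + lam * (w - np.mean(w))  # calibrated E[X | W]

naive = np.polyfit(w, y, 1)[0]         # attenuated slope, roughly 2 * lam
calibrated = np.polyfit(x_cal, y, 1)[0]
print(round(naive, 2), round(calibrated, 2))  # calibrated slope is close to 2
```

The naive slope is attenuated by the reliability ratio; regressing on the calibrated covariate undoes that attenuation, which is exactly what a regression-calibration routine automates.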

multiclass classification having class imbalance with Gradient Boosting Classifier

I am using the Abalone data from UCI (https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data) for classification. I have scaled the data and used t-SNE for visualization. data=pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data') x=data.drop('15', axis=1) y=data['15'] import matplotlib.pyplot as plt mapping={'M':0,'I':1,'F':2} x['M'].replace(mapping,inplace=True) from sklearn.preprocessing import StandardScaler sc=StandardScaler() x_scaled=sc.fit_transform(x) from sklearn.manifold import TSNE sne=TSNE(n_components=2) x_red_sne=sne.fit_transform(x_scaled) plt.scatter(x=x_red_sne[:,0],y=x_red_sne[:,1],c=data['15'],cmap='Spectral') from sklearn.ensemble import GradientBoostingClassifier from sklearn.model_selection import cross_val_score,train_test_split from sklearn.metrics import classification_report,f1_score x_train,x_test,y_train,y_test=train_test_split(x_scaled,y,train_size=.7) gb=GradientBoostingClassifier(n_estimators=200,learning_rate=.1) gb.fit(x_train,y_train)… Read More »
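One common remedy for class imbalance with a gradient booster is to pass per-sample weights to `fit()`, e.g. `gb.fit(x_train, y_train, sample_weight=w)`. The sketch below computes inverse-frequency weights in plain NumPy (the formula mirrors scikit-learn's `class_weight="balanced"`; the toy label vector is illustrative, not the Abalone data).

```python
import numpy as np

# Toy imbalanced label vector: 90 / 9 / 1 samples across three classes.
y = np.array([0] * 90 + [1] * 9 + [2] * 1)

classes, counts = np.unique(y, return_counts=True)
class_w = len(y) / (len(classes) * counts)  # n_samples / (n_classes * n_c)
w = class_w[np.searchsorted(classes, y)]    # one weight per sample

print(dict(zip(classes.tolist(), class_w.round(2).tolist())))
print(round(w.sum(), 2))  # total weight per class is now equal
```

With these weights, each class contributes the same total weight to the loss, so the booster no longer favors the majority class; evaluating with a macro-averaged F1 (as in the question's imports) then gives a fairer picture than accuracy.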

What distribution does my data follow?

Let us say that I have 1000 components and I have been collecting data on how many times they log a failure; each time a failure is logged, I also keep track of how long it took my team to fix the problem. In short, I have been recording the time to repair… Read More »
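Repair times are positive and typically right-skewed, so the usual candidates are the exponential, gamma, Weibull, or lognormal distributions. As a hedged sketch (the data below are simulated stand-ins, not the question's data), here is a method-of-moments gamma fit; in practice one would compare several candidates with Q-Q plots or a goodness-of-fit test.

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in repair-time data: gamma with shape 2, scale 3 (illustrative).
repair_times = rng.gamma(shape=2.0, scale=3.0, size=10000)

m, v = repair_times.mean(), repair_times.var()
shape_hat = m**2 / v  # gamma method-of-moments estimators
scale_hat = v / m

print(round(shape_hat, 1), round(scale_hat, 1))  # recovers roughly 2.0 and 3.0
```

The failure counts themselves, as event counts over a fixed observation window, would more naturally be modeled with a Poisson or negative binomial distribution, separately from the repair durations.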

regression with rounded covariates

A basic assumption of something as simple as OLS regression is that the covariates are continuous. Often this is only partly the case, e.g. heart rate is often reported in integer number of bpm, yet the underlying value is clearly continuous, not discrete. When doing kernel density estimation, there are suggestions for how to deal… Read More »
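A quick simulation makes the practical impact of rounding concrete. Rounding to integers acts roughly like adding uniform(-0.5, 0.5) noise with variance 1/12, so the OLS slope is attenuated by about var(x) / (var(x) + 1/12), which is negligible when the covariate's spread dwarfs the rounding grid. All numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100000
hr = rng.normal(70.0, 10.0, n)          # true continuous heart rate (bpm)
y = 0.5 * hr + rng.normal(0.0, 1.0, n)  # outcome with true slope 0.5

slope_true = np.polyfit(hr, y, 1)[0]            # slope on the continuous value
slope_round = np.polyfit(np.round(hr), y, 1)[0]  # slope on the rounded value
print(round(slope_true, 4), round(slope_round, 4))  # nearly identical
```

Here var(x) = 100 against a rounding variance of 1/12, an attenuation factor above 0.999, so for heart-rate-like covariates the rounding can usually be ignored; it only matters when the rounding grid is coarse relative to the covariate's spread.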

sampling from an unnormalised distribution

If one has to sample from a population $(x_1,x_2,\ldots)$ with weights $(\omega_1,\omega_2,\ldots)$, possibly infinite, a standard simulation procedure is to sum these weights $\omega_i$ into $\mathfrak{s}=\sum_i\omega_i$, sort the weights $\omega_i$ and generate a Uniform $U(0,\mathfrak{s})$ $\iota$ that will compare with the cumulated weights$$\iota\le\omega_1,\quad\iota\le\omega_1+\omega_2,\quad\ldots$$until it meets the inequality. However, this can prove very costly when the… Read More »
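For a finite weight vector, the procedure described above can be sketched in a few lines: cumulate the unnormalised weights, draw $U(0,\mathfrak{s})$, and locate the first index whose cumulative weight exceeds the draw. Using a binary search (`np.searchsorted`) replaces the linear scan through the inequalities with an O(log n) lookup per draw; the weights below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
omega = np.array([5.0, 1.0, 3.0, 1.0])  # unnormalised weights
cum = np.cumsum(omega)                  # cumulated weights; last entry is s

u = rng.uniform(0.0, cum[-1], size=100000)       # U(0, s) draws
idx = np.searchsorted(cum, u, side="right")      # first index with cum > u

freq = np.bincount(idx, minlength=len(omega)) / len(u)
print(freq.round(2))  # matches omega / omega.sum() = [0.5, 0.1, 0.3, 0.1]
```

This still requires the full cumulative sum up front, which is exactly the cost the question is concerned with when the population is very large or infinite; alternatives such as the alias method trade preprocessing for O(1) draws but share the same setup cost.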