EmbDev.net

Forum: Off Topic R Shiny: Creating factor variables and defining levels


von Tobi M. (tobimarsh43)


Rate this post
0 useful
not useful
I trying to create a machine learning application with Shiny.

In this application, the user can select the specifications of the input 
variables (via. input widgets) which will be used to give an estimate of 
a response variable. To do this, I have created a data frame from the 
selected inputs and save it locally as a data table.

A problem arises when I load the data table again since this will cause 
all categorical variables to become character variables. I can however 
change these manually with the factor() function and use the levels= 
specification.

The problem is that I don't want to manually do this every time a new 
dataset gets used, since most likely will change the position of the 
categorical variables in the dataset. There will also most likely not be 
the same amount of categorical variables in a new dataset.

The data frame "DATA" is the main dataset that contains the response 
variable in column 1.

The data frame "test" is the data frame constructed from the selected 
inputs, which will be used as the test set for prediction and will 
contain only 1 specified observation. This data frame will always have 
the response variable as the last column in the data frame, due to how 
the data frame is constructed. So the factor variable in DATA[,5] will 
always correspond to the previous column in the test data frame: 
test[,4].

It is the test data frame that needs the factor levels to be specified 
since it doesn't automatically know the number of categories when it 
only contains 1 observation.

test[4] <- factor(test[4], levels = unique(DATA[,5]))

I'm trying to write a code that will apply the factor function on all 
character variables in the dataset and specify the levels no matter the 
position of the character variable in the dataset.

Here is the code I have written so far:
library(shiny)
library(tidyverse)
library(shinythemes)
library(data.table)
library(RCurl)
library(randomForest)
library(mlbench)
library(janitor)


# Read data
DATA <- BostonHousing

# Rearrange data so the response variable is located in column 1
DATA <- DATA[,c(names(BostonHousing)[14],names(BostonHousing)[-14])]

# Creating a simple RF model
model <- randomForest(medv ~ ., data = DATA, ntree = 500, mtry = 4, importance = TRUE)


# UI -------------------------------------------------------------------------
ui <- fluidPage(
  
  sidebarPanel(
    
    h3("Parameters Selected"),
    br(),
    tableOutput('show_inputs'),
    hr(),
    actionButton("submitbutton", label = "calculate", class = "btn btn-primary", icon("calculator")),
    hr(),
    tableOutput("tabledata")

    
  ), # End sidebarPanel
  
  mainPanel(
    
    h3("Input widgets"),
    uiOutput("select")
    
  ) # End mainPanel
  
) # End UI bracket



# Server -------------------------------------------------------------------------
server <- function(input, output, session) {
  
# Create input widgets from dataset  
  output$select <- renderUI({
    df <- req(DATA)
    tagList(map(
      names(df[-1]),
      ~ ifelse(is.numeric(df[[.]]),
               yes = tagList(sliderInput(
                 inputId = paste0(.),
                 label = .,
                 value = mean(df[[.]], na.rm = TRUE),
                 min = round(min(df[[.]], na.rm = TRUE),2),
                 max = round(max(df[[.]], na.rm = TRUE),2)
               )),
               no = tagList(selectInput(
                 inputId = paste0(.),
                 label = .,
                 choices = sort(unique(df[[.]])),
                 selected = sort(unique(df[[.]]))[1],
               ))
      )
    ))
  })
  

# creating dataframe of selected values to be displayed
  AllInputs <- reactive({
    id_exclude <- c("savebutton","submitbutton")
    id_include <- setdiff(names(input), id_exclude)
    
    if (length(id_include) > 0) {
      myvalues <- NULL
      for(i in id_include) {
        myvalues <- as.data.frame(rbind(myvalues, cbind(i, input[[i]])))
        
      }
      names(myvalues) <- c("Variable", "Selected Value")
      myvalues %>% 
        slice(match(names(DATA[,-1]), Variable))
    }
  })

  
# render table of selected values to be displayed
  output$show_inputs <- renderTable({
    AllInputs()
  })
  
 
# Creating a dataframe for calculating a prediction
  datasetInput <- reactive({  
    
    df1 <- data.frame(AllInputs(), stringsAsFactors = FALSE)
    input <- transpose(rbind(df1, names(DATA[1])))

    write.table(input,"input.csv", sep=",", quote = FALSE, row.names = FALSE, col.names = FALSE)
    test <- read.csv(paste("input.csv", sep=""), header = TRUE)
    
    
# defining factor levels for factor variables
    test[4] <- factor(test[4], levels = unique(DATA[,5])) # <- This line will cause problems if multiple factors in dataset or if different column location
   

# Making the actual prediction and store it in a data.frame     
    Prediction <- predict(model,test)
    Output <- data.frame("Prediction"=Prediction)
    print(format(Output, nsmall=2, big.mark=","))
  })
  

# display the prediction when the submit button is pressed
  output$tabledata <- renderTable({
    if (input$submitbutton>0) { 
      isolate(datasetInput()) 
    } 
  })
  
  
} # End server bracket



# ShinyApp -------------------------------------------------------------------------
shinyApp(ui, server)

Pursuing ML course - 
https://intellipaat.com/machine-learning-certification-training-course/

Reply

Entering an e-mail address is optional. If you want to receive reply notifications by e-mail, please log in.

Rules — please read before posting

  • Post long source code as attachment, not in the text
  • Posting advertisements is forbidden.

Formatting options

  • [c]C code[/c]
  • [avrasm]AVR assembler code[/avrasm]
  • [code]code in other languages, ASCII drawings[/code]
  • [math]formula (LaTeX syntax)[/math]




Bild automatisch verkleinern, falls nötig