While natural language interfaces (NLIs) integrated into data visualization tools offer an opportunity to facilitate analytical flow through conversation, they still exhibit unexpected system behavior due to ambiguities in the conversation between users and the tool. In our initial natural language (NL) elicitation study, we found that for over 70% of NL inputs that exhibited ambiguities, the users' goal could be clarified through contextual conditions, such as the data fields currently selected in the data visualization. However, deriving these contextual conditions poses numerous challenges, whether done upfront by developers or automatically by the system during actual use. Instead, we propose ContexIT, a mixed-initiative system that continuously learns the contextual conditions for NL inputs based on the visualization state and clarifications from actual users.