  1. How to Execute sql queries in Apache Spark - Stack Overflow

    Dec 12, 2020 · It's rather simple now in Spark to run SQL queries. You can run SQL on DataFrames as pointed out by others, but the question is really how to do SQL: spark.sql("SHOW TABLES;") and that's it.
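
    A minimal sketch of that pattern (the view name people and the sample rows are invented for illustration):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-demo").getOrCreate()

# Register a DataFrame as a temporary view so SQL can reference it by name.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
df.createOrReplaceTempView("people")

# spark.sql() runs any SQL statement and returns a DataFrame.
spark.sql("SHOW TABLES").show()
spark.sql("SELECT id, label FROM people WHERE id > 1").show()
```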

  2. How to get day of week in Spark SQL? - Stack Overflow

    SELECT * FROM mytable WHERE DATEPART(WEEKDAY, create_time) = 0
    SELECT * FROM mytable WHERE strftime("%w", create_time) = 0
    How to get the day of week from a date/timestamp in Spark SQL?
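
    One way in Spark SQL itself is the built-in dayofweek function (1 = Sunday through 7 = Saturday) or date_format with a day-of-week pattern; a short sketch, with the sample data invented for illustration:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Invented sample data with a single timestamp column.
df = spark.createDataFrame([("2020-12-12 08:30:00",)], ["create_time"]) \
    .withColumn("create_time", F.to_timestamp("create_time"))

# dayofweek() returns 1 = Sunday ... 7 = Saturday;
# date_format with 'E' gives the abbreviated day name.
df.select(
    F.dayofweek("create_time").alias("dow"),
    F.date_format("create_time", "E").alias("day_name"),
).show()
```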

  3. apache spark - Trino can't connect to Nessie - Stack Overflow

    Feb 19, 2024 · I notice the catalog URI says localhost; that would communicate with the same container running Trino, not the container running Nessie. Are you using docker compose to generate the …
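
    A sketch of the kind of catalog file that fix implies, assuming a docker compose service named nessie on Nessie's default port 19120; the property names follow the Trino Iceberg connector's Nessie catalog support and should be checked against your Trino version's documentation:

```properties
# etc/catalog/iceberg.properties (sketch; verify property names for your Trino version)
connector.name=iceberg
iceberg.catalog.type=nessie
# Use the compose service name, not localhost: inside the Trino container,
# localhost is the Trino container itself.
iceberg.nessie-catalog.uri=http://nessie:19120/api/v2
iceberg.nessie-catalog.default-warehouse-dir=s3://warehouse/
```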

  4. apache spark - PySpark streaming connect with AWS Kinesis …

    Nov 29, 2024 · I am trying to read an AWS Kinesis Data Stream into a PySpark SQL DataFrame. This is my Python code: import pyspark as ps spark = ( ps.sql.SparkSession.builder .config(map= { 'spark...
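
    Spark has no built-in Kinesis source for Structured Streaming, so a connector is required; a hedged sketch, using option names from the Databricks kinesis source (other connectors use similar but not identical options, and the stream name below is a placeholder):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kinesis-demo").getOrCreate()

# Option names follow the Databricks "kinesis" source and are connector-specific;
# "my-stream" is a placeholder.
stream_df = (
    spark.readStream.format("kinesis")
    .option("streamName", "my-stream")
    .option("region", "us-east-1")
    .option("initialPosition", "latest")
    .load()
)

# Echo records to the console while developing.
query = stream_df.writeStream.format("console").start()
query.awaitTermination()
```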

  5. PySpark, Py4JJavaError, Caused by: java.io.EOFException

    Feb 6, 2024 · I guess you created a dedicated environment to test your PySpark code. It's not a solution, but my suggestion is to try executing your code in the default environment (hence, you need to install …
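
    A frequently suggested check for this error is a mismatch between the Python interpreters used by the driver and the workers; a small sketch that pins both to the current interpreter before the session is built:

```python
import os
import sys

# Pin driver and workers to the same interpreter *before* building the session;
# mismatched Python versions are a classic cause of Py4J EOFException.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
print(spark.range(5).count())  # quick smoke test
```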

  6. Different Methods for Creating EXTERNAL TABLES Using Spark SQL in ...

    Jun 18, 2022 · I believe I understand the basic difference between Managed and External tables in Spark SQL. Just for clarity, given below is how I would explain it. A managed table is a Spark SQL …
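
    A sketch of two common ways to create an external (unmanaged) table, with the table names and paths invented for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Variant 1: SQL DDL; the LOCATION clause makes the table external.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_ext (id INT, amount DOUBLE)
    USING PARQUET
    LOCATION '/data/warehouse/sales'
""")

# Variant 2: DataFrame API; supplying a path also yields an external table.
df = spark.createDataFrame([(1, 9.99)], ["id", "amount"])
df.write.option("path", "/data/warehouse/sales2").saveAsTable("sales_ext2")
```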

  7. How do I run SQL SELECT on AWS Glue created Dataframe in Spark?

    May 21, 2019 · To execute SQL queries you will first need to convert the dynamic frame to a DataFrame, register a temp table in Spark's memory, and then execute the SQL query on this temp table.
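
    A minimal sketch of those three steps in a Glue job (the catalog database and table names are placeholders):

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

# A GlueContext wraps the SparkContext and exposes a SparkSession.
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Placeholder catalog database/table names.
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="my_db", table_name="my_table"
)

# 1) DynamicFrame -> DataFrame, 2) register a temp view, 3) run SQL on it.
df = dyf.toDF()
df.createOrReplaceTempView("my_table")
spark.sql("SELECT * FROM my_table LIMIT 10").show()
```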

  8. How apply a different timezone to a timestamp in PySpark

    Aug 27, 2021 · The reason is that Spark first casts the string to a timestamp according to the timezone in the string, and finally displays the result by converting the timestamp to a string according to the session …
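
    A sketch of the two knobs this behaviour involves, the session timezone and from_utc_timestamp for shifting into a target zone (sample data invented):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# The session timezone governs both parsing of timestamp strings and display.
spark.conf.set("spark.sql.session.timeZone", "UTC")

df = spark.createDataFrame([("2021-08-27 10:00:00",)], ["ts_str"]) \
    .withColumn("ts", F.to_timestamp("ts_str"))

# from_utc_timestamp shifts a UTC timestamp into a target zone.
df.select(
    "ts",
    F.from_utc_timestamp("ts", "America/New_York").alias("ts_ny"),
).show(truncate=False)
```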

  9. Read from AWS Redshift using Databricks (and Apache Spark)

    Feb 18, 2022 · Databricks Runtime version: 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12). I've tried the same with the Redshift JDBC driver (using URL prefix jdbc:redshift). Then I had to install …
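
    A sketch of the generic JDBC read path that URL prefix implies; the host, database, table, credentials, and driver class below are placeholders to adapt:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Host, database, table, and credentials are placeholders.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:redshift://my-cluster.example.com:5439/dev")
    .option("dbtable", "public.my_table")
    .option("user", "my_user")
    .option("password", "my_password")
    .option("driver", "com.amazon.redshift.jdbc42.Driver")
    .load()
)
df.show()
```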

  10. Create Spark SQL tables from multiple parquet paths

    May 31, 2018 · Spark uses the Hive metastore to create these permanent tables. These tables are essentially external tables in Hive. Generally, what you are trying is not possible because Hive …
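
    Since a single metastore table generally cannot point at several locations, a common workaround is to read the paths into one DataFrame and register a temporary view (paths invented for illustration):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# read.parquet accepts any number of paths; these are placeholders.
paths = ["/data/events/2018/05", "/data/events/2018/06"]
df = spark.read.parquet(*paths)

# A temp view gives SQL access without a metastore table tied to one location.
df.createOrReplaceTempView("events")
spark.sql("SELECT count(*) FROM events").show()
```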