2019 December 04

Rustam Iksanov in Data Engineers
Nikita Blagodarnyy
Why Scala at all? There are plenty of PDFs on Python.
There's Odersky's book. There are several freely available guides. There are courses. I don't understand the fixation on just this one book.

Рамиль Ахмадеев in Data Engineers
Someone authoritative recommended this particular book.

Anton Zadorozhniy in Data Engineers
For 30 dollars a year (for residents of Russia) you can join ACM, where most of O'Reilly Learning is available (what used to be called Safari Books Online).

Anton Zadorozhniy in Data Engineers
And Horstmann is there too, by the way.

Anton Shelin in Data Engineers
I've read this one. Overall it's fine: compact, no fluff. http://shop.oreilly.com/product/0636920028512.do

Anton Zadorozhniy in Data Engineers

Andrey Smirnov in Data Engineers
Anton Zadorozhniy
For 30 dollars a year (for residents of Russia) you can join ACM, where most of O'Reilly Learning is available (what used to be called Safari Books Online).
A yearly subscription for the price of one book?

Anton Zadorozhniy in Data Engineers
Andrey Smirnov
A yearly subscription for the price of one book?
They won't let you download anything (you can only read in the browser or the apps), but yes, if you read more than one book, it pays off.

Anton Zadorozhniy in Data Engineers
And ACM is a wonderful organization, I recommend it to everyone.

Andrey Smirnov in Data Engineers
Anton Zadorozhniy
They won't let you download anything (you can only read in the browser or the apps), but yes, if you read more than one book, it pays off.
Thanks, I didn't know about this option.

Mikhail Epikhin in Data Engineers
Anton Zadorozhniy
For 30 dollars a year (for residents of Russia) you can join ACM, where most of O'Reilly Learning is available (what used to be called Safari Books Online).
What a twist, and here I am buying an O'Reilly subscription for 200 bucks.

T in Data Engineers
Anton Zadorozhniy
For 30 dollars a year (for residents of Russia) you can join ACM, where most of O'Reilly Learning is available (what used to be called Safari Books Online).
Are all the books from Safari Books available there?

Simon Osipov in Data Engineers
Do you really have time to read that much? Every so often I put 15 bucks into a Humble Bundle, and the O'Reilly and Packt books there are more than enough.

T in Data Engineers
Or are there restrictions?

Pavel in Data Engineers
Flink and a local K8s, job cluster.
I did everything by the manual and it won't start; surely a beginner's mistake.
Has anyone run into something similar?
Starting standalonejob as a console application on host f-jobmanager-j5776.
2019-12-04 15:13:43,101 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
2019-12-04 15:13:43,102 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Starting StandaloneJobClusterEntryPoint (Version: <unknown>, Rev:4d56de8, Date:30.09.2019 @ 11:32:19 CST)
2019-12-04 15:13:43,102 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  OS current user: flink
2019-12-04 15:13:43,102 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Current Hadoop/Kerberos user: <no hadoop dependency found>
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.232-b09
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Maximum heap size: 989 MiBytes
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JAVA_HOME: /usr/local/openjdk-8
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  No Hadoop Dependency available
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  JVM Options:
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Xms1024m
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Xmx1024m
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Program Arguments:
2019-12-04 15:13:43,103 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     --configDir
2019-12-04 15:13:43,104 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     /opt/flink/conf
2019-12-04 15:13:43,104 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Djobmanager.rpc.address=f-jobmanager
2019-12-04 15:13:43,104 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dparallelism.default=1
2019-12-04 15:13:43,104 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dblob.server.port=6124
2019-12-04 15:13:43,104 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -     -Dqueryable-state.server.ports=6125
2019-12-04 15:13:43,104 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         -  Classpath: /opt/flink/lib/event-processing.jar:/opt/flink/lib/flink-table-blink_2.11-1.9.1.jar:/opt/flink/lib/flink-table_2.11-1.9.1.jar:/opt/flink/lib/log4j-1.2.17.jar:/opt/flink/lib/slf4j-log4j12-1.7.15.jar:/opt/flink/lib/flink-dist_2.11-1.9.1.jar:::
2019-12-04 15:13:43,104 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - --------------------------------------------------------------------------------
2019-12-04 15:13:43,105 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-12-04 15:13:43,164 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2019-12-04 15:13:43,164 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-12-04 15:13:43,165 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-12-04 15:13:43,165 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-12-04 15:13:43,165 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-12-04 15:13:43,165 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2019-12-04 15:13:43,165 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.execution.failover-strategy, region
2019-12-04 15:13:43,188 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Starting StandaloneJobClusterEntryPoint.
2019-12-04 15:13:43,188 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Install default filesystem.
2019-12-04 15:13:43,199 INFO  org.apache.flink.core.fs.FileSystem                           - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
2019-12-04 15:13:43,210 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Install security context.
2019-12-04 15:13:43,215 INFO  org.apache.flink.runtime.security.modules.HadoopModuleFactory  - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
2019-12-04 15:13:43,221 INFO  org.apache.flink.runtime.security.SecurityUtils               - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
2019-12-04 15:13:43,222 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Initializing cluster services.
2019-12-04 15:13:43,327 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Trying to start actor system at f-jobmanager:6123
2019-12-04 15:13:43,603 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2019-12-04 15:13:43,622 INFO  akka.remote.Remoting                                          - Starting remoting
2019-12-04 15:13:43,696 INFO  akka.remote.Remoting                                          - Remoting started; listening on addresses :[akka.tcp://flink@f-jobmanager:6123]
2019-12-04 15:13:43,736 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Actor system started at akka.tcp://flink@f-jobmanager:6123
2019-12-04 15:13:43,747 INFO  org.apache.flink.configuration.Configuration                  - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-12-04 15:13:43,751 INFO  org.apache.flink.runtime.blob.BlobServer                      - Created BLOB server storage directory /tmp/blobStore-af97e4e6-8d24-4f58-ab60-41c597894ae7
2019-12-04 15:13:43,754 INFO  org.apache.flink.runtime.blob.BlobServer                      - Started BLOB server at 0.0.0.0:6124 - max concurrent requests: 50 - max backlog: 1000
2019-12-04 15:13:43,761 INFO  org.apache.flink.runtime.metrics.MetricRegistryImpl           - No metrics reporter configured, no metrics will be exposed/reported.
2019-12-04 15:13:43,763 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Trying to start actor system at f-jobmanager:0
2019-12-04 15:13:43,777 INFO  akka.event.slf4j.Slf4jLogger                                  - Slf4jLogger started
2019-12-04 15:13:43,784 INFO  akka.remote.Remoting                                          - Starting remoting
2019-12-04 15:13:43,790 INFO  akka.remote.Remoting                                          - Remoting started; listening on addresses :[akka.tcp://flink-metrics@f-jobmanager:34677]
2019-12-04 15:13:43,796 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Actor system started at akka.tcp://flink-metrics@f-jobmanager:34677
2019-12-04 15:13:43,800 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Starting RPC endpoint for org.apache.flink.runtime.metrics.dump.MetricQueryService at akka://flink-metrics/user/MetricQueryService .
2019-12-04 15:13:43,866 INFO  org.apache.flink.configuration.Configuration                  - Config uses fa

Pavel in Data Engineers
ResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
... 6 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The program plan could not be fetched - the program aborted pre-maturely.

System.err: (none)

System.out: >>> Preparing Flink...

at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:108)
at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:80)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:99)
... 9 more

Pavel in Data Engineers
llback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-12-04 15:13:43,868 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Upload directory /tmp/flink-web-ca93d345-1b42-4740-a845-b415e2b12253/flink-web-upload does not exist.
2019-12-04 15:13:43,868 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Created directory /tmp/flink-web-ca93d345-1b42-4740-a845-b415e2b12253/flink-web-upload for file uploads.
2019-12-04 15:13:43,886 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Starting rest endpoint.
2019-12-04 15:13:44,031 WARN  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - Log file environment variable 'log.file' is not set.
2019-12-04 15:13:44,031 WARN  org.apache.flink.runtime.webmonitor.WebMonitorUtils           - JobManager log files are unavailable in the web dashboard. Log file location not found in environment variable 'log.file' or configuration key 'Key: 'web.log.path' , default: null (fallback keys: [{key=jobmanager.web.log.path, isDeprecated=true}])'.
2019-12-04 15:13:44,115 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Rest endpoint listening at f-jobmanager:8081
2019-12-04 15:13:44,116 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - http://f-jobmanager:8081 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000
2019-12-04 15:13:44,116 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Web frontend listening at http://f-jobmanager:8081.
2019-12-04 15:13:44,164 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at akka://flink/user/resourcemanager .
2019-12-04 15:13:44,172 INFO  org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever  - Scanning class path for job JAR
2019-12-04 15:13:44,175 INFO  org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever  - Using /opt/flink/lib/event-processing.jar (entry class: io.xxxxxxxxxxx.processing.TestJob) as job jar
2019-12-04 15:13:44,195 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Shutting down rest endpoint.
2019-12-04 15:13:44,206 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Removing cache directory /tmp/flink-web-ca93d345-1b42-4740-a845-b415e2b12253/flink-web-ui
2019-12-04 15:13:44,210 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - http://f-jobmanager:8081 lost leadership
2019-12-04 15:13:44,210 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint  - Shut down complete.
2019-12-04 15:13:44,213 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Shutting StandaloneJobClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:210)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:164)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:163)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:501)
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:110)
Caused by: org.apache.flink.util.FlinkException: Could not create the JobGraph from the provided user code jar.
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:109)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
... 6 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The program plan could not be fetched - the program aborted pre-maturely.

System.err: (none)

System.out: >>> Preparing Flink...

at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:108)
at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:80)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:99)
... 9 more
.
2019-12-04 15:13:44,215 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:6124
2019-12-04 15:13:44,215 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2019-12-04 15:13:44,228 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2019-12-04 15:13:44,230 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-12-04 15:13:44,230 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2019-12-04 15:13:44,231 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-04 15:13:44,232 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-04 15:13:44,249 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2019-12-04 15:13:44,255 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2019-12-04 15:13:44,264 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
2019-12-04 15:13:44,268 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
2019-12-04 15:13:44,269 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - Could not start cluster entrypoint StandaloneJobClusterEntryPoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneJobClusterEntryPoint.
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:182)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:501)
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:110)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:210)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:164)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:163)
... 2 more
Caused by: org.apache.flink.util.FlinkException: Could not create the JobGraph from the provided user code jar.
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:109)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcher

Pavel in Data Engineers
Oh damn) sorry))
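
In a dump this long, the innermost `Caused by:` line is usually the first thing worth reading. A minimal Python sketch of pulling the exception chain out of pasted log text follows. The helper names `exception_chain` and `root_cause` are hypothetical, and the sample lines are copied from the log above. It assumes each exception starts its own line, so an exception embedded in the middle of a timestamped log line is not picked up.

```python
import re

# Matches the first line of a Java exception, or a "Caused by:" link in its chain.
# Stack frames ("at ...") and "... N more" lines do not match.
CAUSE_RE = re.compile(
    r"^(?:Caused by: )?((?:[\w$]+\.)+[\w$]*(?:Exception|Error)): (.*)$"
)

# A few lines taken from the log in this thread.
SAMPLE = """\
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneJobClusterEntryPoint.
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:182)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
... 2 more
Caused by: org.apache.flink.util.FlinkException: Could not create the JobGraph from the provided user code jar.
... 6 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The program plan could not be fetched - the program aborted pre-maturely.
"""


def exception_chain(log_text):
    """Return (exception class, message) pairs in the order they appear."""
    chain = []
    for line in log_text.splitlines():
        match = CAUSE_RE.match(line.strip())
        if match:
            chain.append((match.group(1), match.group(2)))
    return chain


def root_cause(log_text):
    """Return the last (innermost) entry of the chain, or None if there is none."""
    chain = exception_chain(log_text)
    return chain[-1] if chain else None


if __name__ == "__main__":
    for cls, msg in exception_chain(SAMPLE):
        print(cls, "->", msg)
```

Calling `root_cause` on the whole pasted log, rather than scrolling through it, gives the one line to search for or quote when asking for help.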